Overview
This project does something that feels like it should need a heavy model, but doesn't: it watches a camera feed and reacts the instant something moves. There is no neural network here at all - just the difference between two frames and a little pixel maths.
The script reads the webcam frame by frame, keeps a short rolling buffer of recent frames, and compares the oldest against the newest. Where nothing changed the difference is near black; where something moved it lights up. Those bright regions get boxed in red, the frame is stamped with a "Burglar Detected!" alert, and a timestamped snapshot is saved to disk - throttled so it doesn't write hundreds of near-identical images.
This is the lightweight counterpart to the Computer Vision with OpenCV & YOLOv8 project, which covers the OpenCV basics this project leans on as well as YOLOv8 detection - and a reminder that not every vision problem needs a model.
Frame differencing flags the moving region in red and prints the alert in real time - no model required, just the difference between two frames. This is a screen recording of the script below actually running.
A few core concepts first
Before the code, the three ideas this project rests on.
Frame differencing - motion as a subtraction
The whole detector is built on one observation: if you subtract one frame
from another a few steps later, the parts that stayed still cancel out to near
black, and only the parts that moved are left bright. OpenCV's
cv2.absdiff(a, b) gives that absolute pixel-by-pixel difference. No training,
no model - motion is just a subtraction.
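To make the subtraction concrete, here is a minimal sketch on two tiny synthetic "frames". It assumes NumPy is available and uses a hypothetical helper, abs_diff, that mimics what cv2.absdiff does for 8-bit images; it is an illustration, not OpenCV's implementation. It also shows why a plain minus sign is not enough: 8-bit arrays wrap around when the result would be negative.

```python
import numpy as np

def abs_diff(a, b):
    """Absolute per-pixel difference of two uint8 images (hypothetical helper)."""
    # Widen to a signed type first so the subtraction cannot wrap around.
    wide = a.astype(np.int16) - b.astype(np.int16)
    return np.abs(wide).astype(np.uint8)

# A flat background, and the same scene with a darker 2x2 object moved in.
earlier = np.full((4, 6), 100, dtype=np.uint8)
later = earlier.copy()
later[1:3, 2:4] = 40

naive = later - earlier          # uint8 arithmetic wraps: 40 - 100 becomes 196
diff = abs_diff(earlier, later)  # still pixels are 0, moved pixels are 60

print(int(naive[1, 2]), int(diff[1, 2]), int(np.count_nonzero(diff)))
```

Only the four pixels that changed are non-zero in diff; everything that stayed still cancels to 0.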
Grayscale and blur - killing the noise
Cameras are noisy: even a perfectly still scene flickers slightly pixel to pixel, and that flicker would register as fake "motion". So before comparing, each frame is converted to grayscale (motion only needs brightness, not colour) and Gaussian blurred to smooth out tiny flicker. Only real, sizeable movement survives that smoothing.
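A small sketch of why blurring helps, assuming NumPy. The helper box_blur3 is hypothetical and uses a plain 3x3 mean as a crude stand-in for cv2.GaussianBlur; the script itself uses a much larger 21x21 Gaussian kernel. A single flickering pixel is spread over its neighbours and drops well under the cutoff of 30 that the script applies later.

```python
import numpy as np

def box_blur3(img):
    """3x3 mean blur with replicated edges (stand-in for a Gaussian blur)."""
    padded = np.pad(img.astype(np.float32), 1, mode="edge")
    h, w = img.shape
    total = np.zeros((h, w), dtype=np.float32)
    # Sum the nine shifted copies of the image, then divide by 9.
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return total / 9.0

still = np.zeros((5, 5), dtype=np.uint8)
flicker = still.copy()
flicker[2, 2] = 90  # one noisy pixel in an otherwise identical frame

raw_peak = int(np.abs(flicker.astype(np.int16) - still).max())
blurred_peak = float(np.abs(box_blur3(flicker) - box_blur3(still)).max())

print(raw_peak, blurred_peak)
```

Without the blur the lone pixel differs by 90 and would pass the threshold; after the blur its peak difference is 10 and it would not.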
Threshold → dilate → contour
The raw difference is a fuzzy gray image. To turn it into clean "this region moved" boxes it goes through three steps: threshold (anything brighter than a cutoff becomes pure white, the rest black), dilate (fatten the white blobs so nearby motion pixels merge into one solid region), and findContours (trace the outline of each white region). A minimum-area filter then ignores small blobs - a leaf, a shadow - so only something sizeable triggers the alert.
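The three steps and the size filter can be sketched on a toy difference image. This version assumes NumPy only; threshold, dilate3 and regions are hypothetical stand-ins for cv2.threshold, cv2.dilate and the findContours + boundingRect pair. In particular, regions counts pixels per connected blob, which is not the same measure as cv2.contourArea, so the cutoff of 30 pixels here is chosen for the toy image and is unrelated to the script's 500.

```python
from collections import deque
import numpy as np

def threshold(diff, cutoff=30):
    """Pixels strictly above cutoff become 255, the rest 0."""
    return np.where(diff > cutoff, 255, 0).astype(np.uint8)

def dilate3(mask, iterations=2):
    """Grow white regions by one pixel per iteration (3x3 maximum)."""
    out = mask.copy()
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant")
        grown = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                grown = np.maximum(grown, padded[dy:dy + h, dx:dx + w])
        out = grown
    return out

def regions(mask, min_pixels=1):
    """Bounding boxes (x, y, w, h) of 8-connected white blobs of enough pixels."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] == 0 or seen[sy, sx]:
                continue
            # Breadth-first flood fill collects one connected blob.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            ys, xs = [], []
            while queue:
                y, x = queue.popleft()
                ys.append(y)
                xs.append(x)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            if len(ys) >= min_pixels:
                boxes.append((min(xs), min(ys),
                              max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return boxes

diff = np.zeros((16, 16), dtype=np.uint8)
diff[4:6, 3:5] = 80    # two fragments of the same moving object...
diff[4:6, 7:9] = 80    # ...separated by a two-pixel gap
diff[13, 13] = 50      # an isolated speck (a leaf, a shadow)
diff[0, 0] = 10        # faint noise below the cutoff

blobs_before = regions(threshold(diff))
boxes = regions(dilate3(threshold(diff)), min_pixels=30)
print(len(blobs_before), boxes)
```

Before dilation the mask holds three separate blobs. After two dilations the two fragments fuse into one region, the speck grows to only 25 pixels, and the size filter leaves a single box around the real motion.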
Because there is no model, this detector doesn't know what moved - only that something did. That is the trade-off: near-zero cost and no training, in exchange for not telling a person from a passing cat. Pairing it with YOLOv8 would add the "what".
The full script
The complete program - open the camera, diff the oldest and newest frames in the buffer, box the motion, alert and snapshot. The walkthrough underneath takes it line by line.
import cv2, os, time

os.makedirs("captures", exist_ok=True)

cam = cv2.VideoCapture(0)
if not cam.isOpened():
    print("Cannot open camera")
    exit()

frames, gap, last_saved = [], 5, 0

while True:
    ok, frame = cam.read()
    if not ok:
        break

    # Grayscale + blur so sensor flicker doesn't register as motion
    gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)

    # Rolling buffer: keep at most gap + 1 recent frames
    frames.append(gray)
    if len(frames) > gap + 1:
        frames.pop(0)

    motion = False
    if len(frames) >= gap:
        # Oldest vs newest frame: still areas cancel, moving areas stay bright
        diff = cv2.absdiff(frames[0], frames[-1])
        _, thresh = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
        thresh = cv2.dilate(thresh, None, iterations=2)
        contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            if cv2.contourArea(c) < 500:
                continue  # too small: noise, a leaf, a shadow
            motion = True
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)

    if motion:
        cv2.putText(frame, "Burglar Detected!", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        # Throttle: at most one snapshot every 3 seconds
        if time.time() - last_saved > 3:
            cv2.imwrite(f"captures/burglar_{int(time.time())}.jpg", frame)
            print("Image saved.")
            last_saved = time.time()

    cv2.imshow("Burglar Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()
Line by line
Setup
- import cv2, os, time brings in OpenCV (frames, drawing, the diff), os (to make the snapshot folder) and time (timestamps and the save throttle).
- os.makedirs("captures", exist_ok=True) makes the folder for saved snapshots if it isn't there yet; exist_ok=True means it won't error if it already is.
- cv2.VideoCapture(0) opens the default camera (device 0). Pass a filename instead and the same loop runs on a recorded video.
- if not cam.isOpened(): guards against a missing or busy camera and exits with a clear message instead of failing silently.
- frames, gap, last_saved = [], 5, 0 - frames is a rolling buffer of recent grayscale frames, gap is how many frames apart we compare, last_saved throttles how often we write to disk.
The frame loop
- ok, frame = cam.read() grabs the next frame; ok says whether it worked.
- if not ok: break exits cleanly when the stream ends.
- gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (21, 21), 0) converts to grayscale and blurs it - blurring kills tiny pixel flicker so only real motion survives.
- frames.append(gray) then if len(frames) > gap + 1: frames.pop(0) caps the buffer at gap + 1 frames - once it is full, the oldest frame falls off the front as each new one arrives.
Detecting motion
- if len(frames) >= gap: waits until at least gap frames are buffered before comparing - the first few frames have nothing far enough back to diff against.
- cv2.absdiff(frames[0], frames[-1]) is the core idea: the absolute pixel difference between the oldest and newest frame. Where nothing moved → near black; where something moved → bright.
- cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY) turns that into pure black/white: any change above 30 becomes white (motion), the rest black.
- cv2.dilate(thresh, None, iterations=2) fattens the white blobs so nearby motion pixels merge into one solid region.
- cv2.findContours(...) finds the outlines of those white regions.
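The buffer handling and the len(frames) >= gap guard can be traced with plain integers standing in for frames. This is a sketch, not the script's code: compared_pairs is a hypothetical helper, and it swaps the list plus pop(0) for collections.deque with maxlen=gap + 1, which drops the oldest item automatically and should behave the same way here.

```python
from collections import deque

def compared_pairs(num_frames, gap=5):
    """Return the (oldest, newest) frame indices that would be diffed."""
    buffer = deque(maxlen=gap + 1)  # holds at most gap + 1 frames
    pairs = []
    for index in range(num_frames):
        buffer.append(index)        # the oldest entry is discarded when full
        if len(buffer) >= gap:      # same guard as in the script
            pairs.append((buffer[0], buffer[-1]))
    return pairs

pairs = compared_pairs(8, gap=5)
spans = [newest - oldest for oldest, newest in pairs]
print(pairs)   # [(0, 4), (0, 5), (1, 6), (2, 7)]
print(spans)   # [4, 5, 5, 5]
```

The trace shows that the very first comparison spans gap - 1 frames and every later one spans exactly gap. If a uniform spacing matters, one option is to change the guard to len(frames) > gap.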
Per-region drawing
- for c in contours: walks each moving region.
- if cv2.contourArea(c) < 500: continue ignores small blobs - noise, a leaf, a shadow - and only reacts to something sizeable.
- cv2.boundingRect(c) gets a box around the motion; cv2.rectangle(...) draws it in red ((0, 0, 255) in OpenCV's BGR colour order).
- motion = True records that this frame had real movement, which triggers the alert below.
Alerting and snapshotting
- cv2.putText(frame, "Burglar Detected!", ...) stamps the alert in red at the top-left of the frame.
- if time.time() - last_saved > 3: is the throttle - only save a snapshot if more than 3 seconds have passed since the last one, so we don't write hundreds of near-identical frames.
- cv2.imwrite(f"captures/burglar_{int(time.time())}.jpg", frame) writes a timestamped JPG, then last_saved = time.time() resets the throttle.
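The throttle is easy to check in isolation by passing the clock in as an argument. The class below is a hypothetical wrapper around the same comparison the script makes, written so it can be driven with made-up timestamps instead of time.time().

```python
class SnapshotThrottle:
    """Allow a save only if more than `interval` seconds passed since the last."""

    def __init__(self, interval=3.0):
        self.interval = interval
        self.last_saved = 0.0  # same starting value as last_saved in the script

    def ready(self, now):
        """Return True, and reset the timer, when a save is allowed at `now`."""
        if now - self.last_saved > self.interval:
            self.last_saved = now
            return True
        return False

throttle = SnapshotThrottle(interval=3.0)
motion_times = [100.0, 101.0, 103.5, 104.0, 107.0]  # seconds with motion
decisions = [throttle.ready(t) for t in motion_times]
print(decisions)  # [True, False, True, False, True]
```

Because last_saved starts at 0, the first frame with motion always produces a snapshot; after that, saves are at least 3 seconds apart.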
Showing and cleanup
- cv2.imshow("Burglar Detection", frame) displays the annotated frame.
- if cv2.waitKey(1) & 0xFF == ord('q'): break waits 1 ms for a key and quits on q - the standard way to make a real-time OpenCV window closable.
- cam.release() and cv2.destroyAllWindows() free the camera and close the window when the loop ends.
Key insight
The lesson here is that not every vision problem needs a neural network. A
genuinely useful burglar detector is just the difference between two frames:
absdiff to find what moved, threshold + dilate + findContours to clean it
into regions, and a minimum-area filter to ignore noise. It runs on anything,
needs no training data, and costs almost nothing.
The flip side is what it can't do: with no model it knows that something moved, never what. It can't tell a person from a pet or a swaying branch from an intruder. The natural upgrade is to feed the motion regions into the YOLOv8 detector from the companion Computer Vision with OpenCV & YOLOv8 project - cheap motion-gating first, then a model only when something actually moves.
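The motion-gating idea can be sketched without any model at all. Everything here is hypothetical: gate_detector and fake_detector are illustrative names, the motion flags stand in for the script's motion variable, and the fake detector stands in for a real model such as YOLOv8. The point is only the control flow: the expensive call runs on frames with motion and nowhere else.

```python
from typing import Callable, Iterable, List, Optional

def gate_detector(motion_flags: Iterable[bool],
                  detector: Callable[[int], str]) -> List[Optional[str]]:
    """Call `detector` only for frame indices whose motion flag is True."""
    results: List[Optional[str]] = []
    for index, moved in enumerate(motion_flags):
        results.append(detector(index) if moved else None)
    return results

calls = []  # records which frames reached the (pretend) model

def fake_detector(index: int) -> str:
    """Placeholder for an expensive model call; always answers 'person'."""
    calls.append(index)
    return "person"

labels = gate_detector([False, False, True, True, False], fake_detector)
print(labels)  # [None, None, 'person', 'person', None]
print(calls)   # [2, 3]
```

Of five frames, only the two with motion reach the detector, which is where the saving comes from when most of the feed is a still scene.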
Tech stack
- Python 3.12
- OpenCV (cv2) - camera I/O, grayscale, blur, absdiff, threshold, dilate, findContours, drawing and display
- No model / no training - pure frame differencing
- os and time - snapshot folder, timestamps and the save throttle
Reference
- OpenCV - absdiff - Core array operations
- OpenCV - Contours - Contours: Getting Started
- OpenCV - Image Thresholding - Thresholding tutorial