Monday, March 12, 2007

type tracking improved 2

able to better track the type using a tug-of-war. when the object at the head of the worm is classified as a car, the tug-of-war score is pushed 0.5 in the positive direction. when the object is classified as a person, it's pulled 1 in the negative direction. the overall type of the worm is car if the cars are winning, person if the people are winning. the lopsided weights work because people tend to jump classification a lot, while cars tend to stay classified as cars. another approach would be to change the ratio used to split the person vs. car classification in the first place. we suspect this will become more important once the foreshortening calibration calculation is actually used in classification. -ation -ation -ation.
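
here's a rough sketch of the tug-of-war update, just to pin down the idea (the struct and function names are made up for illustration, not our actual tracker code):

typedef enum { TYPE_UNKNOWN, TYPE_CAR, TYPE_PERSON } worm_type;

typedef struct {
    float tug;   /* > 0 means the cars are winning, < 0 means the people are */
} worm;

/* called once per frame with the classification of the blob at the head of the worm */
void worm_update_type(worm *w, worm_type frame_type)
{
    if (frame_type == TYPE_CAR)
        w->tug += 0.5f;   /* cars push a little */
    else if (frame_type == TYPE_PERSON)
        w->tug -= 1.0f;   /* people pull harder, since people jump classification a lot */
}

worm_type worm_overall_type(const worm *w)
{
    if (w->tug > 0.0f) return TYPE_CAR;
    if (w->tug < 0.0f) return TYPE_PERSON;
    return TYPE_UNKNOWN;
}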

type tracking improved

type history is implemented. the type of a worm can change during the video, but it's still recognized as the same worm. the final type of the worm is the average of its type over its whole history. this doesn't work very well: everything ends up being categorized as a car. watching the types update frame-by-frame it doesn't look that way, though, so further investigation is necessary.

the indicator box sometimes lags behind the moving blob. the blob joining might not be updating the blobs correctly. we haven't been able to find the documentation for the CopyEdges() function in OpenCV, but that's probably where the problem is.

worm tracking

worm tracking has been slightly improved. objects are now tracked all the way across the screen instead of being broken up into many smaller objects. there is even a part where two objects overlap but keep their correct object IDs.
classification has not improved yet. the code for the foreshortening calculation has been written, and we are running tests to find the best threshold for the foreshortening-adjusted size.
unfortunately, we still have no way of posting the videos.

Monday, March 5, 2007

time to refine

we shot more footage over the weekend and ran our algorithm on it. this time we were in regents parking lot. the first set of videos was unusable because there were some branches in the way. we relocated and took a second set of videos which looked fine.
after running our current code on the second set, we found that the algorithm currently assumes that cars will always be about the same size on screen. we had avoided using size to determine the type of a blob, but detecting the blob in the first place still relied on it being a particular size. because of this, we created some goals and compromises:
  1. better classification
    1. using object size with correction for foreshortening
  2. better object tracking
    1. be more forgiving when an object's class changes; keep calling it the same object
    2. keep a history of the object's class at each frame and use that history to update the overall class
  3. video will be shot from AP&M only until algorithm can be generalized

Monday, February 26, 2007

Movement detection, in graphs

Earlier, we decided to take into account the ratio of height to width when classifying blob types. If the height is greater than the width, the blob is most likely a person; likewise, if the width is greater than the height, the movement is most likely from a car. We've also decided to "merge" blobs--that is, take smaller blobs that are near each other and combine them into one.

Here is a graph of the movement detected from one of the AP&M videos:

[graph: person, walking from the right hand side of the video to the left]

[graph: car driving from the right to the left]

In both cases, the system detected the correct type of movement. However, the system still tends to get confused often. It seems like the amount of white in the difference image might be too low to detect more than small blobs. For the time being we're correcting for this by assuming that each blob is 20x20 for the purpose of tracking (the original sizes are still used for type detection). We'll continue gathering data to determine the accuracy of our algorithm.
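
For reference, here's a minimal sketch of the two rules described above (the names are ours, not from the actual code): classify by comparing height to width, and treat every blob as a fixed 20x20 box for tracking only.

typedef enum { BLOB_PERSON, BLOB_CAR } blob_type;

/* taller than wide -> probably a person; wider than tall -> probably a car */
blob_type classify_blob(int width, int height)
{
    return (height > width) ? BLOB_PERSON : BLOB_CAR;
}

/* for the tracking step only, every blob is treated as a fixed 20x20 box
   around its center; the real width/height still goes to classify_blob() */
enum { TRACK_SIZE = 20 };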

Wednesday, February 21, 2007

Improved object tracking

So it turns out that L*a*b* image differencing was implemented improperly. Among other things, the limited precision of the default OpenCV types made objects appear when they shouldn't. As a result, it looked like RGB differencing performed better. I made the necessary changes to fix the tracker, resulting in this video.

In particular, the L*a*b* image differencing now takes the absolute difference of all three channels, adds them together, and then smooths the result with a 7x7 Gaussian kernel. The tracking now performs slightly better than the RGB version, but judging from the results, blob merging will actually be needed after all. From the look of it, the white blobs that appear in the difference video aren't strong enough to be detected as contiguous blobs. Thresholding with OpenCV results in either lots of unintended noise or nothing at all. Reducing the blob detection threshold further than we already have (it's now 25, down from 100) should produce better results.
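
Roughly, the differencing step now looks like this (written against the OpenCV 1.x C API as a sketch, not the actual tracker source; the threshold of 25 here stands in for the blob detection threshold mentioned above):

#include <cv.h>

/* absolute L*a*b* difference of two frames, summed across channels,
   smoothed with a 7x7 Gaussian, then thresholded at 25 */
IplImage *lab_difference(const IplImage *bgr_cur, const IplImage *bgr_ref)
{
    CvSize sz = cvGetSize(bgr_cur);
    IplImage *lab_cur = cvCreateImage(sz, IPL_DEPTH_8U, 3);
    IplImage *lab_ref = cvCreateImage(sz, IPL_DEPTH_8U, 3);
    IplImage *diff    = cvCreateImage(sz, IPL_DEPTH_8U, 3);
    IplImage *l    = cvCreateImage(sz, IPL_DEPTH_8U, 1);
    IplImage *a    = cvCreateImage(sz, IPL_DEPTH_8U, 1);
    IplImage *b    = cvCreateImage(sz, IPL_DEPTH_8U, 1);
    IplImage *sum  = cvCreateImage(sz, IPL_DEPTH_8U, 1);
    IplImage *mask = cvCreateImage(sz, IPL_DEPTH_8U, 1);

    cvCvtColor(bgr_cur, lab_cur, CV_BGR2Lab);
    cvCvtColor(bgr_ref, lab_ref, CV_BGR2Lab);
    cvAbsDiff(lab_cur, lab_ref, diff);                  /* per-channel |difference| */
    cvSplit(diff, l, a, b, NULL);
    cvAdd(l, a, sum, NULL);                             /* sum the channels         */
    cvAdd(sum, b, sum, NULL);                           /* (saturates at 255)       */
    cvSmooth(sum, sum, CV_GAUSSIAN, 7, 7, 0, 0);        /* 7x7 Gaussian kernel      */
    cvThreshold(sum, mask, 25, 255, CV_THRESH_BINARY);  /* blob threshold, now 25   */

    cvReleaseImage(&lab_cur); cvReleaseImage(&lab_ref); cvReleaseImage(&diff);
    cvReleaseImage(&l); cvReleaseImage(&a); cvReleaseImage(&b);
    cvReleaseImage(&sum);
    return mask;
}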

Also, Daniel and I went out to AP&M to perform foreshortening calculations and data collection. We were able to take video from both Glen Tesler's office and from the bridge between both sections of AP&M. This resulted in the following videos:

AP&M
AP&M (2)
AP&M (3)
AP&M (4)
AP&M (5)
AP&M (6)

[The videos that are only a second or so long are intended as snapshots of Daniel on the lawn in front of AP&M]

Monday, February 12, 2007

Blob tracking

We've converted our video diffing application to use L*a*b* (as opposed to RGB). The funny thing about this is that, at least with the video I used for testing, diffing against the last frame worked better than diffing against the first frame. It might be due to the simplicity of the video, though. We've also implemented blob tracking. Here are links to some sample output:

Original video
Video run through our application

We're going to work on optimization this week, and on implementing blob merging and foreshortening correction (features that are not in the current version).

Friday, February 9, 2007

for capture

i captured some video to use for the foreshortening calculation. the laptop and camera were on the 5th floor bridge of ap&m. ryan underwood watched the laptop while i went down and held up a green sign. there was a bit of wind, so the sign was hard to hold steady.
i captured in ppm using this line:
ffmpeg -an -s 960x720 -vcodec ppm -f image2pipe foreshortening1.ppm
and converted the result to mp4 for viewing using this line:
ffmpeg -vcodec ppm -f image2pipe -i foreshortening1.ppm -vcodec mpeg4 foreshortening1.mp4
unfortunately the resulting video has skips in it and cuts off before the entire walk is done. it still seems to have enough information to be usable. using this video we can count the number of pixels the poster covers at each y-line and get a size ratio for each row.
we should also be able to calculate this ratio by knowing the camera height and distances and using 3d projection, but i expect getting that information would be just as hard and more error-prone.
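
as a sketch, the table we want out of that footage looks something like this (the names and the reference row are hypothetical, nothing is measured yet):

#define FRAME_HEIGHT 720

static float sign_width[FRAME_HEIGHT];  /* measured width of the sign, in pixels, at each y-line */
static float fore_ratio[FRAME_HEIGHT];  /* multiply a blob's size by this to normalize for depth  */

/* build the per-row foreshortening ratio relative to some reference row,
   e.g. the row where the sign appeared largest */
void build_foreshortening_table(int reference_y)
{
    int y;
    float ref = sign_width[reference_y];
    for (y = 0; y < FRAME_HEIGHT; y++)
        fore_ratio[y] = (sign_width[y] > 0.0f) ? ref / sign_width[y] : 1.0f;
}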

here's the first video
and the second.

Monday, February 5, 2007

Ground truth and algorithms

On Saturday, Daniel and I worked on a couple of really important algorithms:
  1. An algorithm to combine blobs that are very close together/closely related--this is necessary to work around the poor performance of OpenCV's blob detection functionality in poor lighting conditions.
  2. An algorithm to track blobs across frames, assuming that algorithm 1 was run on each frame. This is so we can see the trajectory of people blobs towards different cars. (A rough sketch of the matching step follows this list.)
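
Here's a rough sketch of the frame-to-frame matching half (algorithm 2), with made-up names and a made-up distance threshold; merging (algorithm 1) would be a similar nearest-neighbor pass over blobs within a single frame:

#include <math.h>

typedef struct { float x, y; int id; } blob;   /* centroid plus an object id */

#define MATCH_DIST 50.0f   /* hypothetical: how far a blob can move between frames */

static float blob_dist(const blob *a, const blob *b)
{
    float dx = a->x - b->x, dy = a->y - b->y;
    return sqrtf(dx * dx + dy * dy);
}

/* give each blob in the current frame the id of the nearest blob from the
   previous frame, or a fresh id if nothing is within MATCH_DIST */
void match_blobs(blob *cur, int ncur, const blob *prev, int nprev, int *next_id)
{
    int i, j;
    for (i = 0; i < ncur; i++) {
        int best = -1;
        float best_d = MATCH_DIST;
        for (j = 0; j < nprev; j++) {
            float d = blob_dist(&cur[i], &prev[j]);
            if (d < best_d) { best_d = d; best = j; }
        }
        cur[i].id = (best >= 0) ? prev[best].id : (*next_id)++;
    }
}
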
Speaking of people and car blobs, I also took some additional video on Saturday from the AP&M building (looking at the Faculty Club lot). Using Nick True's labeling program, I labeled several frames from that video. They're below:

groundtruth1.png
groundtruth2.png
groundtruth3.png

(car blobs are red, people blobs are green)

From the looks of things, car blobs and people blobs are extremely similar in size, at least from that vantage point. Simply looking at size won't be enough unless the two kinds of blobs turn out to be clearly distinguishable. Perhaps we can track each blob and use its rate of movement to determine which category it belongs to.

Also, OpenCV's blob detection library seems too slow for real-time use (70-90 ms per frame according to gprof). For the time being, we won't worry about detecting in real time.

Monday, January 29, 2007

movellan, guest, differences

i talked more with movellan last thursday. he's not as keen on working with us as he was before. he had been under the impression that we could work more directly on the robotics problem. he's encouraging though.
on friday i met with guest again. he gave me some code for ppm manipulation, like scaling.

i also washed my phone, so now it's even harder to talk with guest. he's more of a phone person and not so much on email. we're going to meet again this friday. we were supposed to meet today, but he hasn't shown up.

our own code can now do a diff against the frame from n frames ago, as well as a running-average difference. unfortunately video does not play on my laptop, so i'll have to test it later.
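
the two modes look roughly like this (a sketch in the spirit of videodiff.c with our own names and a made-up history size; the real code works on ppm frames piped in from ffmpeg):

#define W 960
#define H 720
#define NHIST 8              /* hypothetical: how many past frames we keep around */

static unsigned char history[NHIST][W * H];  /* ring buffer of past frames */
static float         avg[W * H];             /* running-average background */

/* |current - the frame from n frames ago|; assumes n <= NHIST and that at
   least n frames have already been stored in the ring buffer */
unsigned char diff_n_ago(const unsigned char *cur, int i, int frames_seen, int n)
{
    const unsigned char *old = history[(frames_seen - n) % NHIST];
    int d = (int)cur[i] - (int)old[i];
    return (unsigned char)(d < 0 ? -d : d);
}

/* |current - running average|, then fold the current pixel into the average */
unsigned char diff_running_avg(const unsigned char *cur, int i, float alpha)
{
    float d = (float)cur[i] - avg[i];
    avg[i] = (1.0f - alpha) * avg[i] + alpha * (float)cur[i];
    return (unsigned char)(d < 0.0f ? -d : d);
}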

Wednesday, January 24, 2007

Blob detection library

I was Googling around and found a blob detection library that we can potentially use. The only thing is that it's written in Java; we'll need to port it to C before we can try it with the video footage we've captured so far. Java's syntax is similar enough to C that the port shouldn't cause too many problems.

I'd also like to figure out what algorithm it's using, and compare it with the currently available algorithms out there. This will require reading the library code, though.

Friday, January 19, 2007

Video differences

We now have a video diff application. Here is a processed version of the clip posted previously. It looks pretty promising, and better than I expected. Right now, it compares every subsequent frame against the first frame captured from the camera, but we eventually want to diff against the previous frame instead (to account for changes in scenery/time of day).

The current code for the video diff program is on Subversion, at svn://svn.lifeafterking.org/cse190/videodiff/. To compile:
gcc -O2 -march=pentium4 -mtune=pentium4 -mmmx -g -o videodiff videodiff.c
To run:
ffmpeg -vcodec ppm -f image2pipe | ./videodiff | ffmpeg -vcodec pgm -f image2pipe -i - -vcodec mpeg4 [output file]
And there you have it. :)

Wednesday, January 17, 2007

Initial video capture

This was taken earlier this morning from the sixth floor of EBU1. Near the end of the video, it shows a van leaving a parking space and stopping in the middle of the parking lot. Afterwards, the driver gets out, puts something in the trunk, and gets back in. I would have preferred footage of someone walking to his/her car, but this works for the time being.

Anyways, off to class.

Tuesday, January 16, 2007

Initial footage from camera

So, it looks like the command for ffmpeg is actually incorrect with the version that I'm using. It should be:

ffmpeg -an -s 960x720 -vcodec ppm -f image2pipe test.ppm

or this for B&W:

ffmpeg -an -s 960x720 -vcodec pbm -f image2pipe test.pbm

(of course, replacing test.pbm/ppm with "-" to output to standard output)

Result (clicking on the image goes to the full sized version):

[image: frame captured with the command above]

Friday, January 12, 2007

schmoozing

i talked with professor Guest today at work. he is excited about the idea of finding out which parking spots are open using a webcam. he said he's been meaning to do a similar project on his own. work was rather slow and the bosses were away at lunch, so we didn't have much to do related to our jobs.
i also saw doctor Movellan last night at robotics. when Movellan heard that we only had 9 weeks, he said he didn't think that was enough time. he suggested that we use frame-by-frame differencing to track moving objects.
ekiga will display video just fine, but i can't get ffmpeg, luvcvideo, and mplayer to do any video capture. i might just have to go straight debian. mooneer was able to get both my camera and his working. none of this ubuntu stuff.
i'm off to class, k c u later bye.

Forget DirectShow, we're using Linux

So, it turns out that there's actually a lot of stuff out there for Linux video/webcam capture. It's also much easier to deal with than DirectShow. For instance, this shows an ffmpeg command that will convert the webcam input to a series of grayscale PGM images, one per frame. I can even use prerecorded video instead of the camera by using the filename of the video instead of /dev/video0. With some well-written scripts and the actual detection application, a GUI for the entire system isn't even needed.

Oh yeah, and Linux supports the Logitech camera using the uvcvideo driver. Excellent. :)

Anyways, we'll record some video for testing this weekend.