Merge pull request #10 from nielstenboom/development
Enhancements
nielstenboom authored Sep 26, 2020
2 parents 1260c3d + 0691e9e commit 16c9d62
Showing 14 changed files with 187 additions and 188 deletions.
2 changes: 1 addition & 1 deletion .dockerignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
.git/
# videos/
videos/
76 changes: 0 additions & 76 deletions .github/workflows/docker-publish.yml

This file was deleted.

2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ annotations.csv
*.h5
*.mp4
*.p
videos

# Byte-compiled / optimized / DLL files
__pycache__/
Expand Down Expand Up @@ -130,3 +131,4 @@ dmypy.json

# Pyre type checker
.pyre/
.DS_Store
2 changes: 1 addition & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,6 @@ RUN conda install python=3.6 -y && \
apt-get install libglib2.0-0 -y && \
apt-get install -y libsm6 libxext6 libxrender-dev -y && \
apt-get install ffmpeg -y && \
conda install faiss-cpu -c pytorch
conda install faiss-cpu=1.6.3 -c pytorch


71 changes: 63 additions & 8 deletions README.MD
Original file line number Diff line number Diff line change
Expand Up @@ -56,23 +56,68 @@ You can run the detector in a python program in the following way:
import recurring_content_detector as rcd
rcd.detect("/directory/with/season/videofiles")
```
This will run the detection by building the color histogram feature vectors. The feature vector function can also be changed:
This will run the detection by building the color histogram feature vectors. Make sure the video files can be sorted into the correct alphabetical order, matching the order in which they play in the season (episode_1 -> episode_2 -> episode_3, and so on); otherwise you'll get weird results.


The feature vector function can also be changed:
```python
# options for the function are ["CNN", "CH", "CTM"]
rcd.detect("/directory/with/season/videofiles", feature_vector_function="CNN")
```
This will use CNN vectors, which are a bit more accurate but take much longer to build.

Because the videos need to be resized and the feature vectors saved in files, some artifacts will be created. On default they will be saved in the same directory as the video files, if you want them saved in a different directory:
The `detect` function has more parameters that can be tweaked; the defaults are the values that gave me the best results in my experiments.

```python
rcd.detect("/directory/with/season/videofiles", feature_vector_function="CH", artifacts_dir="/tmp")
def detect(video_dir, feature_vector_function="CH", annotations=None, artifacts_dir=None, framejump=3, percentile=10, resize_width=320, video_start_threshold_percentile=20, video_end_threshold_seconds=15, min_detection_size_seconds=15):
"""
The main function to call to detect recurring content. Resizes videos, converts to feature vectors
and returns the locations of recurring content within the videos.
arguments
---------
video_dir : str
Variable that should have the folder location of one season of video files.
annotations : str
Location of the annotations.csv file, if annotations is given then it will evaluate the detections with the annotations.
feature_vector_function : str
Which type of feature vectors to use, options: ["CH", "CTM", "CNN"], default is color histograms (CH) because of balance between speed and accuracy. This default is defined in init.py.
artifacts_dir : str
Directory location where the artifacts should be saved. Default location is the location defined with the video_dir parameter.
framejump : int
The frame interval to use when sampling frames for the detection. A higher number means fewer frames are taken into consideration, which improves processing time but will probably cost accuracy.
percentile : int
Which percentile of the best matches will be considered recurring content. A high percentile means higher recall and lower precision; a low percentile means lower recall and higher precision.
resize_width: int
Width to which the videos will be resized. A lower number means higher processing speed but less accuracy and vice versa.
video_start_threshold_percentile: int
Percentage of the start of the video within which detections are kept. Since recaps and opening credits only occur in the first part of a video file, this parameter sets that threshold: with a value of 20, recurring content found in the first 20% of the video's frames is marked as a detection; anything detected later is ignored.
video_end_threshold_seconds: int
Threshold, in seconds from the end of the video, within which the final detection must end for it to count. With a value of 15, a detection at the end of a video is only kept if it ends in the last 15 seconds of the video.
min_detection_size_seconds: int
Minimum number of seconds a detection must last before it counts. Since credits, recaps, and previews generally last longer than a few seconds, it's wise to pick a number higher than 10.
returns
-------
dictionary
dictionary that maps every video file name to a list of detected (start, end) timestamps in seconds
{"episode1.mp4" : [(start1, end1), (start2, end2)],
"episode2.mp4" : [(start1, end1), (start2, end2)],
...
}
"""
```
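The returned dictionary is plain Python, so post-processing is straightforward. For instance, a small sketch (the sample detections below are made up, not real output) that prints the timestamps in H:MM:SS form:

```python
def format_timestamp(seconds):
    """Convert a detection timestamp in seconds to H:MM:SS."""
    seconds = int(seconds)
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

# Example shape of the return value of rcd.detect() (values made up)
detections = {
    "episode1.mp4": [(2.0, 92.5), (2510.0, 2580.0)],
    "episode2.mp4": [(0.0, 88.0)],
}

for episode, timestamps in detections.items():
    for start, end in timestamps:
        print(f"{episode}: {format_timestamp(start)} - {format_timestamp(end)}")
```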

Make sure the video files can be sorted into the correct alphabetical order, matching the order in which they play in the season (episode_1 -> episode_2 -> episode_3, and so on); otherwise you'll get weird results.

Detection will take some time, since video processing is resource-intensive. A production application should run detections in parallel.


## Annotations

If you want to quantitatively test how well this works on your own data, fill in the [annotations](annotations_example.csv) file and supply its path with the `annotations` parameter.
Expand All @@ -99,7 +144,17 @@ Detections for: episode3.mp4
Total precision = 0.862
Total recall = 0.853
```
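Precision and recall here are measured over seconds of overlap between detected and annotated intervals. Roughly, the idea looks like this (a sketch of the metric, not the library's actual evaluation code):

```python
def overlap(a, b):
    """Seconds of overlap between two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def precision_recall(detected, annotated):
    """Interval-based precision/recall over seconds of overlap."""
    relevant = sum(overlap(d, a) for d in detected for a in annotated)
    total_detected = sum(end - start for start, end in detected)
    total_annotated = sum(end - start for start, end in annotated)
    return relevant / total_detected, relevant / total_annotated

# Made-up intervals: the detection slightly overshoots the annotated credits
detected = [(0.0, 90.0)]
annotated = [(5.0, 85.0)]
p, r = precision_recall(detected, annotated)
print(round(p, 3), round(r, 3))  # precision below 1.0, recall exactly 1.0
```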

## Tests

There are a few tests in the test directory. They can also be run in the Docker container; make sure you created a `videos` directory with some episodes in it:
```shell
docker run -it -v $(pwd):/opt/recurring-content-detector nielstenboom/recurring-content-detector:latest python -m pytest -s
```

## Credits
- https://github.com/noagarcia/keras_rmac for the CNN vectors
- https://github.com/facebookresearch/faiss for the efficient matching of the feature vectors

## Final words
If you use and like my project or want to discuss something related, I would ❤️ to hear about it! You can send me an email at [email protected].
18 changes: 2 additions & 16 deletions recurring_content_detector/__init__.py
Original file line number Diff line number Diff line change
@@ -1,19 +1,5 @@
from . import detector
from . import config

def detect(video_dir, annotations = None, feature_vector_function = "CH", artifacts_dir = None):

old_width = config.RESIZE_WIDTH

# make sure resize width of 224 is used with CNN
if feature_vector_function == "CNN":
config.RESIZE_WIDTH = 224

result = detector.detect(video_dir, feature_vector_function, annotations, artifacts_dir)

# set config variable back to the old value,
# so when reusing the module, there is no unexpected behavior.
config.RESIZE_WIDTH = old_width

return result
def detect(*args, **kwargs):
return detector.detect(*args, **kwargs)

11 changes: 0 additions & 11 deletions recurring_content_detector/config.py

This file was deleted.

90 changes: 60 additions & 30 deletions recurring_content_detector/detector.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
from natsort import natsorted, ns

# internal imports
from . import config
from . import featurevectors
from . import video_functions
from . import evaluation
Expand Down Expand Up @@ -38,19 +37,26 @@ def fill_gaps(sequence, lookahead):
input: [0,0,1,0,0,0,0,1,0,0] with lookahead=6
output: [0,0,1,1,1,1,1,1,0,0]
"""

i = 0
while i < len(sequence) - lookahead:
current = sequence[i]
next = sequence[i + 1 : i + lookahead].tolist()

if current and True in next:
x = 0
while not next[x]:
sequence[i + 1 + x] = True
x = x + 1

i = i + 1

change_needed = False
look_left = 0
while i < len(sequence):
look_left -= 1
if change_needed and look_left < 1:
change_needed = False
if sequence[i]:
if change_needed:
for k in to_change:
sequence[k] = True
else:
change_needed = True
look_left = lookahead
to_change = []
else:
if change_needed:
to_change.append(i)
i+=1
return sequence
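The gap-filling behavior described in the docstring can be reproduced with a minimal standalone sketch of the same idea (a simplified reimplementation for illustration, not the project's code):

```python
def fill_gaps(sequence, lookahead):
    """Fill the gap between two True values with True when they are
    at most `lookahead` positions apart (simplified sketch)."""
    sequence = [bool(x) for x in sequence]
    for i, value in enumerate(sequence):
        if not value:
            continue
        # exclusive lookahead window, mirroring sequence[i+1 : i+lookahead]
        window = sequence[i + 1 : i + lookahead]
        if True in window:
            nxt = i + 1 + window.index(True)
            for j in range(i + 1, nxt):
                sequence[j] = True
    return sequence

print(fill_gaps([0, 0, 1, 0, 0, 0, 0, 1, 0, 0], lookahead=6))
# [False, False, True, True, True, True, True, True, False, False]
```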

def get_two_longest_timestamps(timestamps):
Expand Down Expand Up @@ -133,7 +139,8 @@ def query_episodes_with_faiss(videos, vectors_dir):
return results


def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir = None):
def detect(video_dir, feature_vector_function="CH", annotations=None, artifacts_dir=None, framejump=3, percentile=10,
resize_width=320, video_start_threshold_percentile=20, video_end_threshold_seconds=15, min_detection_size_seconds=15):
"""
The main function to call to detect recurring content. Resizes videos, converts to feature vectors
and returns the locations of recurring content within the videos.
Expand All @@ -150,6 +157,26 @@ def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir
artifacts_dir : str
Directory location where the artifacts should be saved. Default location is the location
defined with the video_dir parameter.
framejump : int
The frame interval to use when sampling frames for the detection. A higher number means fewer frames are
taken into consideration, which improves processing time but will probably cost accuracy.
percentile : int
Which percentile of the best matches will be considered recurring content. A high percentile means
higher recall and lower precision; a low percentile means lower recall and higher precision.
resize_width: int
Width to which the videos will be resized. A lower number means higher processing speed but less accuracy and vice versa.
video_start_threshold_percentile: int
Percentage of the start of the video within which detections are kept. Since recaps and opening credits
only occur in the first part of a video file, this parameter sets that threshold: with a value of 20,
recurring content found in the first 20% of the video's frames is marked as a detection; anything
detected later is ignored.
video_end_threshold_seconds: int
Threshold, in seconds from the end of the video, within which the final detection must end for it to count.
With a value of 15, a detection at the end of a video is only kept if it ends in the last 15 seconds
of the video.
min_detection_size_seconds: int
Minimum number of seconds a detection must last before it counts. Since credits, recaps, and previews
generally last longer than a few seconds, it's wise to pick a number higher than 10.
returns
-------
Expand All @@ -161,14 +188,19 @@ def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir
...
}
"""

# if feature vector function is CNN, change resize width
if feature_vector_function == "CNN":
resize_width = 224

print("Starting detection")
print(f"Framejump: {config.FRAMEJUMP}")
print(f"Video width: {config.RESIZE_WIDTH}")
print(f"Framejump: {framejump}")
print(f"Video width: {resize_width}")
print(f"Feature vector type: {feature_vector_function}")

# define the static directory names
resized_dir_name = "resized{}".format(config.RESIZE_WIDTH)
feature_vectors_dir_name = "{}_feature_vectors_framejump{}".format(feature_vector_function,config.FRAMEJUMP)
resized_dir_name = "resized{}".format(resize_width)
feature_vectors_dir_name = "{}_feature_vectors_framejump{}".format(feature_vector_function,framejump)

# the video files used for the detection
videos = [f for f in os.listdir(video_dir) if os.path.isfile(os.path.join(video_dir, f))]
Expand Down Expand Up @@ -197,12 +229,12 @@ def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir
# if there is no resized video yet, then resize it
if not os.path.isfile(file_resized):
print("Resizing {}".format(file))
video_functions.resize(file_full, file_resized)
video_functions.resize(file_full, file_resized, resize_width)

# from the resized video, construct feature vectors
print("Converting {} to feature vectors".format(file))
featurevectors.construct_feature_vectors(
file_resized, feature_vectors_dir_name, feature_vector_function)
file_resized, feature_vectors_dir_name, feature_vector_function, framejump)

# query the feature vectors of each episode on the other episodes
results = query_episodes_with_faiss(videos, vectors_dir)
Expand All @@ -212,17 +244,15 @@ def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir
total_detected_seconds = 0
total_relevant_detected_seconds = 0

framejump = config.FRAMEJUMP

all_detections = {}
for video, result in results:
framerate = video_functions.get_framerate(os.path.join(video_dir, video))
threshold = np.percentile(result, config.PERCENTILE)
threshold = np.percentile(result, percentile)

# all the detections
below_threshold = result < threshold
# Merge all detections that are less than 10 seconds apart
below_threshold = fill_gaps(below_threshold, int((framerate/config.FRAMEJUMP) * 10))
below_threshold = fill_gaps(below_threshold, int((framerate/framejump) * 10))

# put all the indices where values are nonzero in a list of lists
nonzeros = [[i for i, value in it] for key, it in itertools.groupby(
Expand All @@ -236,13 +266,13 @@ def detect(video_dir, feature_vector_function, annotations = None, artifacts_dir
start = nonzero[0]
end = nonzero[-1]

#result is in first 20% of the video
occurs_at_beginning = end < len(result) / 5
#the end of this timestamp ends in the last 15 seconds
ends_at_the_end = end > len(result) - 15 * (framerate/framejump)
#result is in first video_start_threshold% of the video
occurs_at_beginning = end < len(result) * (video_start_threshold_percentile / 100)
#the end of this timestamp ends in the last video_end_threshold seconds
ends_at_the_end = end > len(result) - video_end_threshold_seconds * (framerate/framejump)

if (end - start > (15 * (framerate / framejump)) #only count detection when larger than 15 seconds
and (occurs_at_beginning or ends_at_the_end)): #only use results that are in first 1/5 part or end at last 15 s
if (end - start > (min_detection_size_seconds * (framerate / framejump)) #only count detection when larger than min_detection_size_seconds seconds
and (occurs_at_beginning or ends_at_the_end)): #only use results that are in first part or end at last seconds

start = start / (framerate / framejump)
end = end / (framerate / framejump)
Expand Down
