Efficient Model #2

ClashLuke · 2022-06-26T17:06:01Z

When training our model to "see" screenshots, it's essential that it sees them at full resolution rather than massively downscaled, because information gets lost very quickly. For example, text written on a screen becomes unreadable after just one 2x reduction. It's therefore critical that our model can process these large states efficiently.
The difficulty in this problem comes from the fact that a "full screenshot" is composed of ~9 million pixels (in the case of a 4k monitor), each having three features. In total, that'd be 27 million features per frame, which would let us fit only about 18 frames (500 / 27 ≈ 18.5) at a theoretical maximum of 500 million features per sample. As seeing only the past 18 or so frames is not feasible, we need to improve the memory efficiency of our model by feeding it something other than pure frames.
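
For anyone who wants to re-run the arithmetic, here is a rough sketch of the budget calculation. The 500 million feature limit and the three features per pixel come from the estimate above; the two concrete resolutions (3840x2160 and 4096x2160) and the `frame_budget` helper are my own assumptions for illustration, since the issue only says "4k".

```python
# Back-of-the-envelope check of the frame budget discussed above.
FEATURE_BUDGET = 500_000_000  # theoretical per-sample maximum quoted above
CHANNELS = 3                  # three features (RGB) per pixel


def frame_budget(width, height, channels=CHANNELS, budget=FEATURE_BUDGET):
    """Return (features per frame, whole frames that fit in the budget)."""
    features = width * height * channels
    return features, budget // features


# Assumed resolutions; "4k" can mean either of these.
resolutions = {"UHD 4k": (3840, 2160), "DCI 4k": (4096, 2160)}

for name, (w, h) in resolutions.items():
    features, frames = frame_budget(w, h)
    print(f"{name}: {features:,} features/frame, {frames} frames per sample")
```

Under these assumptions both resolutions land at around 18 to 20 whole frames per sample, so the conclusion is the same either way: raw frames alone give far too short a history.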

ClashLuke added the ml requires in-depth machine learning knowledge label Jun 26, 2022
