When training our model to "see" screenshots, it is essential that it sees them at full resolution rather than heavily downscaled, because information is lost very quickly. For example, text on a screen becomes unreadable after just one 2x reduction. It is therefore critical that our model can process these large states efficiently.
The difficulty comes from the fact that a "full screenshot" is composed of ~9 million pixels (in the case of a 4k monitor), each with three features. In total, that is about 27 million features per frame, so a theoretical maximum of 500 million features per sample holds fewer than 19 frames (500 / 27 ≈ 18.5). As seeing only the past 18 or so frames is not feasible, we need to improve the memory efficiency of our model by feeding it something other than raw frames.
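The frame-budget arithmetic above can be checked with a short sketch. The 500 million feature budget and the three features per pixel come from the text. The concrete resolutions are assumptions, since the text only says "~9 million pixels": 4096x2160 (DCI 4K) is closest to that figure, and 3840x2160 (UHD) is included for comparison. The function name `frames_per_sample` is hypothetical.

```python
def frames_per_sample(width, height, channels=3, feature_budget=500_000_000):
    """Return (features per frame, number of whole frames that fit in the budget)."""
    features_per_frame = width * height * channels
    # Integer division: a partially fitting frame does not count.
    return features_per_frame, feature_budget // features_per_frame


# Assumed resolutions; the source only states "~9 million pixels".
for name, (w, h) in {"DCI 4K": (4096, 2160), "UHD 4K": (3840, 2160)}.items():
    features, frames = frames_per_sample(w, h)
    print(f"{name}: {features:,} features per frame, {frames} frames per sample")
```

Under these assumptions the budget holds roughly 18 to 20 raw frames depending on the exact resolution, which is the same order of magnitude as the estimate above and supports the conclusion that raw frames are too expensive as model input.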