Football, one of the most popular sports on earth, draws millions of fans every time a game is played: 22 players compete for possession of the ball.
The truth is that watching football games isn’t the whole story, and if we examine how much data we’re able to gather from a game, we might be able to understand why.
As the author of a contribution to this subproblem of วิเคราะห์บอล, I can share my experiences with understanding as much information as possible from television-like video feeds of football matches.Something is bothering you
Despite the fact that you could place multiple fixed cameras around the field, it is difficult to extract positional and semantic information from a moving camera. It would be difficult to achieve that in a real stadium due to budget and permission constraints.
There are a variety of ways to process video data if you are on a budget and do not want to leave your chair.How to approach the issue
The approach we took was to break the task into smaller, more manageable and more specific pieces, just like any textbook (good) programmer would do.
The following divisions were formed as a result:
- Project the positions of the players onto a two-dimensional space, based on the camera view (reference estimation and homography estimation).
- Player, ball, and official identification (e.g., who they are).
- It is crucial to my project to track objects (also called entities).
- Can the players be identified between frames? Do I have any options for identifying them?
- What team does a player play for (how can I find that out).
After examining the design of the overall system, we can then move on to examining the specific tasks, such as positional and semantic data.
After each frame sequence is received, it is processed according to the fields and entities detected (object detection). Once nearly consecutive events are detected, we begin tracking each entity. The position of each entity relative to the camera is also projected as we estimate the field’s location relative to the camera. We are also able to track each player’s performance by identifying him and placing him within a team.
Afterwards, we repeat the video frame by frame until it ends. The next step is to smooth the data. To smooth the data, we look back at the data we have collected frame by frame and perform “backward adjustments” to ensure trajectory detections and trajectory paths stay similar over the sequence.
As soon as a frame is fed into the system, you can see the steps that occur inside it.A method for detecting objects
From a machine learning perspective, one of the first things one notices is how hard it is to locate good quality labeled data. Among object detectors, YoloV3 is among the most popular.
When the frame is cropped and the pre-trained net is used, the results will be disappointing. We used YOLO to feed the original resolution image across the network since accuracy was more important than speed. Using this method, you will be able to tell when a ball is near a player or referee.