Overwatch is a 6v6 action game that emphasizes strategy and teamwork. While individual skill is still important, its effect is significantly smaller than in games like Counter-Strike: Global Offensive. Because it is a primarily objective-based game, individual performance metrics like damage done, eliminations, accuracy, or kill/death ratio often cannot be directly compared and do not reflect the actual outcomes of matches.
I pursued this project with the motivation of analytically comparing how different teams and players perform, and of seeing whether certain trends or insights could be extracted from the game.
The Overwatch League provides a few Tableau dashboards to visualize data such as team, player, and teamfight information. Blizzard, the company behind Overwatch, actively limits the availability of these metrics both in-game (minimal comparison tools) and out of game (only a few aggregated metrics from a few selected players are shown post-game) to discourage players from using them for performance evaluation (and to tone down teammate blaming). This trend is no different for Overwatch League games, where only some aggregated data from individual matches is available.
For reference, the dashboards and data are available at https://overwatchleague.com/en-us/statslab; however, none of it is used for this project.
Although fine-grained play-by-play data is unavailable, the Overwatch League does have a replay viewer, allowing games to be replayed in the game client. This is what will be used to extract precisely timed data from the games. The replay viewer allows free-roam, first-person, and third-person views.
Many games and maps are played in Overwatch, and because of the video recording and post-processing time required, a 10-minute game takes around 6-7 hours of processing. Ultimately, the following criteria were chosen:
The end result is a total of 6 games analyzed, each between 10 and 15 minutes long. However, one specific game, the OWL 2020 Playoffs Losers' Finals between Seoul Dynasty and Shanghai Dragons, had multiple processing failures that remained unresolved at the time of writing, so the rest of this report is based on the remaining 5 maps:
Here's a short extract showing what the captured video looks like. The playback speed is 2x, captured at 60 frames per second, without sound, and in third-person perspective. Graphics settings were also dialed down to the lowest possible while maintaining 1080p resolution. OBS Studio and a series of keyboard macros were used to automatically capture and loop the replays between all 12 players, resulting in a 1-hour video for a 10-minute match.
Although this verification was not strictly required, the videos of all players were automatically synchronized, and a sample freeze-frame image is provided to demonstrate this.
With the video captured, frame-by-frame analysis can be done.
Each King of the Hill map uses a first-to-two scoring system. Each round consists of the following states:
| State | Description | Image |
|---|---|---|
| Countdown to round start | At the beginning of a round, there is a 30-second countdown before players can leave their spawning area. | ![]() |
| Countdown to objective unlock | Once the round starts, there is another 30-second countdown until the objective can be captured. | ![]() |
| Objective unlocked | Once the objective is unlocked, the teams fight for control, trying to reach 100% progress. In this snapshot, the white team is currently in control of the objective. | ![]() |
| Overtime | When the team in control reaches 99% progress and the opposing team is actively contesting the objective, the round enters the overtime state. No further progress is made until either team stops contesting. | ![]() |
| End of round | The round ends when one of the teams achieves 100% progress. | ![]() |
To determine the current state of a round from each image frame, OpenCV's `matchTemplate()` was used against these templates. If none of the templates matches, the round is currently in progress with the objective unlocked.
Initially, `PyTesseract` was used to attempt to recognize the team progress percentages; however, the accuracy was very low. As an alternative, the time each team spent in control was measured instead. This was achieved by the following steps:
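One plausible sketch of such a control-time counter, assuming the objective UI recolors to the controlling team's color (the team colors, tolerance, and dominance threshold below are placeholders, not the project's actual values):

```python
import numpy as np

# Hypothetical team UI colors (RGB); "white" matches the controlling team
# shown in the earlier snapshot, the other value is a placeholder.
TEAM_A_RGB = np.array([230, 230, 230])
TEAM_B_RGB = np.array([80, 160, 255])
COLOR_TOLERANCE = 40      # assumed per-channel distance cutoff
DOMINANCE_FRACTION = 0.3  # assumed fraction of pixels needed to call control

def control_state(ui_patch):
    """Return 'team_a', 'team_b', or None based on the dominant UI color."""
    pixels = ui_patch.reshape(-1, 3).astype(int)
    a_hits = (np.abs(pixels - TEAM_A_RGB).max(axis=1) < COLOR_TOLERANCE).sum()
    b_hits = (np.abs(pixels - TEAM_B_RGB).max(axis=1) < COLOR_TOLERANCE).sum()
    if max(a_hits, b_hits) < DOMINANCE_FRACTION * len(pixels):
        return None  # neither color dominates: objective is neutral
    return "team_a" if a_hits > b_hits else "team_b"

def control_seconds(patches, fps=60):
    """Accumulate control time by counting frames per state at a known fps."""
    counts = {"team_a": 0, "team_b": 0}
    for patch in patches:
        state = control_state(patch)
        if state:
            counts[state] += 1
    return {team: frames / fps for team, frames in counts.items()}
```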
OpenCV's `matchTemplate()` was used against these templates. With only a limited number of heroes played during the playoffs, these partial templates were enough to determine each player's hero selection and when they swapped heroes. Hero detection is necessary because each hero has a different health pool and different abilities.
Health and ultimate charge values are shown at fixed positions in the UI. While the health value is always represented as a number, when a player's ultimate is fully charged, the number turns into an icon. OpenCV's `matchTemplate()` was used to detect whether this checkmark icon is present in the portraits bar, indicating a charged ultimate.
All health and ultimate charge numbers were extracted and stored for further decoding.
Initially, OCR with `PyTesseract` was attempted to recognize the digits; however, even when restricting recognition to a numeric character subset, the results were poor. With the noisy backgrounds and a font containing several very similar-looking digits, OCR accuracy was below 60%.
Improvements to the OCR performance were attempted through better image pre-processing. Binarizing the background and foreground did help, but the results were still unsatisfactory (below 80% accuracy), with digits like 3 / 5 / 6 / 9 and 8 / 0 still getting mixed up. A combination of thresholding, deskewing, dilation, erosion, noise removal, and flood filling was used to achieve better digit separation.
Given OCR's poor performance, OpenCV's template matching was explored and showed promise. With some image pre-processing, individual digits were segmented from the numbers, and a template was generated for each digit.
The digit matching gave decent results, as it was the first time that individual digits were matched with an accuracy above 95%. However, problems still persisted with certain sets of numbers.
A `Keras` digit classifier model was attempted next. Building on the decent template-matching results, a model was trained on 12,000 digit samples labeled from those earlier results. The model performed perfectly, achieving 100% accuracy on digit classification. However, one problem still prevented it from being the final solution: the digits were not always extracted properly from the images.
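A minimal Keras classifier along these lines (the architecture and the 28x28 input size are assumptions; the report does not specify the actual network):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_digit_classifier():
    """Small CNN over grayscale digit crops, 10 output classes (digits 0-9)."""
    model = keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_digit_classifier()
# model.fit(train_images, train_labels, epochs=5)  # ~12,000 labeled samples
```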
Due to the high variance of colors in the background, the threshold and flood-fill transformations were problematic.
![]()
![]()
![]()
Examining the histograms of these problematic examples revealed that dynamic thresholding was needed in order to binarize the input images better.
![]()
![]()
![]()
![]()
With the improved results, the model was applied to other videos. Unfortunately, the results were still inconsistent; the image pre-processing proved too brittle given the variance in the input data.
Since one video had already produced good results, a large number (50,000+) of correctly labeled samples was available. With little to lose, two neural-network number classifiers, one for health (up to 3 digits) and another for ultimate charge (up to 2 digits), were trained to see how they performed; the samples were pre-processed only by converting to grayscale. Although some errors still persisted, the end result was excellent after a two-frame-confirmation outlier-removal pass. These models were used in the following analysis.
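The two-frame confirmation pass can be sketched as follows. This is a simplified interpretation of the idea: a new reading is accepted only if it persists for two consecutive frames; otherwise the last confirmed value is carried forward, which drops one-frame misclassifications.

```python
def two_frame_confirm(values):
    """Suppress single-frame outliers in a sequence of per-frame readings.

    A value is confirmed when it repeats the last confirmed value or is
    also seen on the immediately following frame; unconfirmed readings
    are replaced with the last confirmed value.
    """
    confirmed = []
    last = None
    for i, v in enumerate(values):
        if v == last or (i + 1 < len(values) and values[i + 1] == v):
            last = v
        confirmed.append(last)
    return confirmed
```

For example, a one-frame glitch such as a health reading of 37 between stable readings of 100 is smoothed away, while a genuine change that persists for two frames is kept.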
Different stages of teamfights (What is a Teamfight?) can now be identified from the data. A teamfight usually includes the following stages:
A team-health-based threshold could provide a good baseline for detecting teamfights. A simple algorithm combining the percentage of health remaining per player, eliminated players, and a gradual ease-in factor after respawn resulted in the following model.
The simple model exceeded expectations and matched up very well to manual identification. This model will be used as is for the following analysis.
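A sketch of such a model (the ease-in duration, the fight threshold, and the exact weighting are placeholder assumptions; the project's actual parameters are not given):

```python
RESPAWN_EASE_SECONDS = 3.0  # assumed ease-in duration after respawn
FIGHT_THRESHOLD = 0.65      # assumed fraction of full team health

def player_weight(health_fraction, seconds_since_respawn):
    """Eliminated players contribute 0; recently respawned players ease
    back in linearly so a fresh spawn does not instantly 'heal' the team."""
    if health_fraction <= 0:
        return 0.0
    ease = min(1.0, seconds_since_respawn / RESPAWN_EASE_SECONDS)
    return health_fraction * ease

def in_teamfight(team):
    """team: list of (health_fraction, seconds_since_respawn) per player.
    Flags a teamfight while average weighted health drops below threshold."""
    total = sum(player_weight(h, t) for h, t in team) / len(team)
    return total < FIGHT_THRESHOLD
```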
With the data collected and teamfights identified, an overview for each round was generated.
Unfortunately, there were fewer than 100 teamfights in total across these 5 matches. The grounds for drawing conclusions from the analysis below are therefore weak, but it should serve as a demonstration of what is possible with the extracted data.
For the first teamfight, teams are on even ground. Let's see if certain teams do better than others.
Often, the team that gets the first kill is considered more likely to win the teamfight. Let's see if this is the case.
There is a strong belief that the team with the ultimate advantage has a significantly higher chance of winning the next teamfight; let's see if that is the case.
Let's explore whether there is a relationship between teamfight length and team win rates.
Improvements can be made to the current state of the project, namely:
Using computer vision to extract data and insights from Overwatch replays is technically feasible. However, some limitations exist; for example, it is currently not possible to track damage sources, and certain animations (such as D.Va calling down her mech) block out the UI completely for a split second.