Aoimirai - Kpop tools Full Disclaimers

As with any data mining, aggregation and statistics, there are plenty of disclaimers about how data are acquired and treated, as well as intentional limitations or characteristics of the system. As much as I would like to make every single detail available right where it applies, that would make the site pretty unreadable, with fine print and disclaimers everywhere, so this page contains all of them.

Artist database

Artists are added manually as they become famous, have significant sales or MV views, or when someone reports a missing one by mail. General data is gathered from Wikipedia and fansites. If you find anything wrong with an artist's data, such as name, debut video, social media and so on, mail me. I do not monitor new profiles and accounts for artists (other than those posted in r/kpop), so over time much of my information about their social media will become outdated.

Some small sub-units or soloists (usually with only one MV) are not added separately and are instead listed inside the main group for simplicity's sake. The biggest example of this is LOOΠΔ and its pre-debut sub-units. Small collabs between artists are also often not added, because there are too many combinations that have just one song, and it wouldn't make much sense to add each one as a whole new artist. Exceptions are usually made when one of the collaborating artists is noteworthy enough that counting the collaboration's views/sales towards that artist makes sense. An example is Soyou, with her many collaborations.

As a side note, while we do not allow videos not hosted by official channels, some debut videos were released before Korean studios started posting their releases on YouTube and were never officially posted. These might be linked to a non-official account if that is the only available video. The Debut system also uses a date correction in case the date of the debut video is not the release date (it was posted later, so the date on YouTube is not when the video was originally released).

It's worth mentioning that releases by Korean artists who move out of Korea and start a solo career in a foreign country, in a foreign language, will not be counted towards their Kpop profile. A good example is Tiffany after she left SM: she moved to the US, signed with a US label and releases America-oriented English-language music, which is therefore not Kpop. Here is a list of artists that are NOT included on this site:

  • Tiffany Young (after moving to America)
  • Kris Wu (after moving to China)
  • Z-Girls (is not K-pop, self declared "international pop" that just happens to be based in Korea)
  • Z-Boys (same)
  • Jackson Wang (China)

YouTube data (views and likes)

Videos are added manually as they are released, when a new artist is added to the database, or when they are reported as missing. Since I maintain this site alone, new releases might take a few days to be registered. Please also allow a few days for me to catch up on reports and mails if you send one. Videos with a very low view count are added depending on their significance (debuts, only available video for an artist, etc.) but might be left out; the same goes for alternate versions.

The reason we do not add "Mirrored" videos (unless one is the only available dance practice and is on an official channel) is that fans use these videos to learn and practice the choreography and replay them a lot for that reason, which gives them a huge view count for different reasons than usual MVs or even choreography videos.

Views are gathered by the main cron bot, which crawls YouTube gathering views and likes every 15 seconds. Since there are about 4000 videos to fetch and the bot averages 180 crawls per hour, it currently cycles through the whole database in about one day. However, priority videos (recent releases, high views/day) get updated more frequently (every 6 hours).
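The cycle time quoted above follows from simple arithmetic, sketched here in Python. The two constants come from the text; the `is_due` helper and the 24-hour interval for normal videos are hypothetical illustrations, not the bot's actual code, and the calculation ignores the extra crawls spent on priority videos.

```python
from datetime import datetime, timedelta

TOTAL_VIDEOS = 4000      # approximate database size, from the text
CRAWLS_PER_HOUR = 180    # average crawl rate, from the text

# Priority videos are refreshed every 6 hours (from the text).
PRIORITY_INTERVAL = timedelta(hours=6)
# Assumption: everything else waits for roughly one full cycle (about a day).
NORMAL_INTERVAL = timedelta(hours=24)


def full_cycle_hours(total_videos: int, crawls_per_hour: int) -> float:
    """Hours needed to fetch every video once at the given average rate."""
    return total_videos / crawls_per_hour


def is_due(last_fetched: datetime, now: datetime, priority: bool) -> bool:
    """Hypothetical helper: has this video waited at least its interval?"""
    interval = PRIORITY_INTERVAL if priority else NORMAL_INTERVAL
    return now - last_fetched >= interval


if __name__ == "__main__":
    hours = full_cycle_hours(TOTAL_VIDEOS, CRAWLS_PER_HOUR)
    print(f"{hours:.1f} hours per full cycle")
```

At 180 crawls per hour, 4000 videos take a little over 22 hours, which is where the "one day" figure comes from.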

On the first day of the month, the bot prioritizes videos and artists on the top lists, so that the Top History page gets the most accurate top list around 18:00 UTC.

Videos that are deleted get flagged and are never updated again. Keep in mind that sometimes a video is only temporarily disabled and returns later, and since I cannot check that, there is a small possibility that a video marked as removed ends up returning. From time to time I check some of these removed videos to see whether they were only temporarily disabled and are back up, but that is rare.

Note that the bot only reads the HTML page from YouTube to fetch views and likes. It never touches the media files and therefore never starts a view.


Sales data

Sales are calculated as the sum of data from before 2011 (which are hard to find) and data from 2011 onwards, when GAON started.

For sales PRIOR to 2011, dubbed the MIAK era (Music Industry Association of Korea), data are hard to find, and gathering them usually requires digging into archives and fan pages. Fans often update Wikipedia with some data, but few artist pages are fully accurate and up to date. The initial MIAK data was gathered from Wikipedia, with other sources used for some artists depending on how readily available and trustworthy they are. While no HANTEO data is used for the GAON period, there is a big chance that some of the MIAK era data comes from HANTEO. Whenever possible, how the MIAK era data were gathered is displayed when you click "Sales composition" on an artist.

GAON uses a different method than HANTEO: GAON counts physical copies shipped through distribution lines, not sales to end users, whereas HANTEO uses real-time reports from stores about sales to end users. For instance, when a new debut happens, the studio orders a certain number of physical copies to be distributed to stores. Stores then purchase these as they see fit (they might not purchase all printed copies). GAON counts the copies that were shipped to stores, regardless of whether they end up being sold. Because of this, as time goes by and stores return unsold items, GAON corrects its data accordingly. GAON is known as an accurate source for older releases for that reason, but since it does not have current sales, it is a terrible source for debuts and new releases. That is why TV shows, awards and the like use HANTEO data for up-to-date sales. Sometimes GAON data is used for end-of-year awards.

Unfortunately, GAON is far from accurate, and in my experience they are very disorganized. Beyond that internal disorganization, a problem common to GAON and HANTEO (less so with GAON) is that fans often buy releases in bulk to try to influence shows and awards, and then RETURN those copies for a full refund. Both GAON and HANTEO then have to revise the sales, so reported sales often go down (this happens faster with HANTEO since it is real-time, but it happens about as much with GAON). The problem is that GAON does not report these updates in real time, and on top of that, it only reports the top 100 sales each month. Small numbers that can add up to something significant month after month are therefore not shown. The end-of-year report, which does contain these updates, also only shows the top 100, so we still miss a lot of adjustments. Therefore, even raw GAON data is not reliable.

GAON sales on this site are gathered by a bot that sums all sales from the GAON monthly reports, then uses the GAON yearly report to detect and correct updates. This is the most accurate one can get from GAON data, but as explained above, it has plenty of limitations. Also, since we need the yearly report to correct data, all sales for the current year should be taken with a grain of salt: they could be incorrect until the yearly report is available to correct them.
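The monthly-sum-then-yearly-correction idea can be sketched roughly as follows. This is a minimal illustration and not the bot's actual code: the function name, the data layout, the album names and the rule that a yearly figure simply overrides the monthly sum are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def total_sales(
    monthly_reports: List[Dict[str, int]],
    yearly_report: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Sum monthly figures per album, then apply the yearly correction.

    monthly_reports: one dict per month, mapping album name -> copies.
    yearly_report: optional dict mapping album name -> copies for the year.
    """
    totals: Dict[str, int] = defaultdict(int)
    for report in monthly_reports:
        for album, copies in report.items():
            totals[album] += copies

    if yearly_report:
        # Assumption: the yearly chart already reflects returns and late
        # adjustments, so its figure replaces the monthly sum when present.
        for album, copies in yearly_report.items():
            totals[album] = copies
    return dict(totals)


# Hypothetical data: "Album B" never makes the yearly top 100,
# so its monthly sum cannot be corrected.
monthly = [{"Album A": 50000, "Album B": 3000}, {"Album A": 20000}]
yearly = {"Album A": 65000}
corrected = total_sales(monthly, yearly)
```

In this made-up example, "Album A" drops from a monthly sum of 70000 to the yearly figure of 65000, while "Album B" keeps its uncorrected 3000, which is the limitation described above.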

Only physical copies sold in Korea are counted.

Other tools

The "This day in Kpop" feature uses the date videos were published, usually in Korean time. The "This week" feature uses the week number in which a video was updated, taking into consideration that the first and last weeks of a year can overlap with the adjacent year (so the first week of 2018 contains the last week of 2017 and so on).
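As a rough illustration of both points, the sketch below converts a UTC publish time to Korean time and derives a week key that can cross a year boundary. It assumes ISO 8601 week numbering, which may not be the scheme the site actually uses, and the function names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

# Korea Standard Time is UTC+9 with no daylight saving time.
KST = timezone(timedelta(hours=9), "KST")


def kst_date(published_utc: datetime) -> date:
    """Calendar date of a timezone-aware UTC timestamp, in Korean time."""
    return published_utc.astimezone(KST).date()


def same_day_of_year(a: date, b: date) -> bool:
    """True when two dates share month and day, regardless of year."""
    return (a.month, a.day) == (b.month, b.day)


def week_key(d: date) -> tuple:
    """(ISO year, ISO week) pair; the ISO year can differ from d.year."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


# A video published late on 30 December 2018 UTC is already
# 31 December in Korea, which ISO numbering places in week 1 of 2019.
published = datetime(2018, 12, 30, 16, 0, tzinfo=timezone.utc)
local_day = kst_date(published)
key = week_key(local_day)
```

Grouping by the (year, week) pair rather than by week number alone keeps days at the edge of a year in the same week as their neighbours.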

While debut videos are marked, note that not all debuts had videos, so some artists might not have debuts marked on the site. Also, most artists that debuted before 2005 don't have an MV available (some do, but it's rare). Debut videos that were posted at a later date (officially or by fans) have a "corrected" date, so they appear as a Debut on the correct historical date rather than on the date the video was posted (but the video date is retained for average views/day).
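One way to picture the two dates is the sketch below, where a video keeps its posting date for the views/day average and carries an optional corrected date for the debut listing. The class, field names and sample values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Video:
    title: str
    posted: date                            # date the video was posted
    views: int
    corrected_debut: Optional[date] = None  # original release date, if known

    def debut_date(self) -> date:
        """Date used for the debut listing: the corrected one when set."""
        return self.corrected_debut or self.posted

    def views_per_day(self, today: date) -> float:
        """Average daily views, counted from the actual posting date."""
        days = max((today - self.posted).days, 1)  # avoid division by zero
        return self.views / days


# Hypothetical debut from 1998 whose video was only posted in 2010.
video = Video("Example debut", date(2010, 1, 1), 36500,
              corrected_debut=date(1998, 5, 1))
```

Here `video.debut_date()` gives the 1998 date for the historical listing, while `video.views_per_day(date(2010, 4, 11))` divides by the 100 days since posting, not by the years since the original release.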

The Top 30 history has been gathered automatically on the first day of every month around 18:00 UTC since May 2017 (for a short period it might show doubled entries because of an auto-backup). Due to a bug in the data gathering, Artist totals were incorrect and were therefore discarded up to May 2018. For that reason, we have data for top MVs starting May 2017, and for top Artists only from May 2018.

Supporting the site

This site is totally non-profit. The only sources of money were ads and tipping via PayPal. The ads were small and non-intrusive, and were removed in December 2018 because they were totally useless: in 2 years I had not reached the US$100 goal for payment. While these appear across the whole site (I run 2 domains, and both have more than just Kpop), I have yet to receive more than US$50 from donations. The cost of running the server is about US$15 per month, so I pay to keep the site up even when I add up all donations and ad revenue. I also often ask the community for help (usually via Twitter) to improve the site or the database, but it is extremely rare for someone to help and a lot more common for people to nitpick problems, so I don't get any non-monetary help to keep this up either. Therefore, I ask for some patience regarding missing artists, MVs or features, since this is a hobby site maintained only by me.

Donations/tips this year: $3. 2018 donations: $36. Yearly server cost: $180.