ad04145791b59754c6a78f4dac94efdf92ede89d
galt
  Wed Jun 19 15:29:48 2019 -0700
Updated geoIp mapping for our 3 official mirrors - RR, euro, asia with much better quality data from MaxMind.

diff --git src/hg/geoIp/README src/hg/geoIp/README
deleted file mode 100644
index 7bc30b8..0000000
--- src/hg/geoIp/README
+++ /dev/null
@@ -1,209 +0,0 @@
-NEW genome-asia!
-
-Our previous update was from 2011 when genome-euro was first made,
-so we are going to update everything with fresh data.
-
-RESTARTING update from ground zero,
-all tables backed up and cleared out on hgFixed
-
-wget 'http://dev.maxmind.com/static/csv/codes/iso3166.csv' -O country.csv
-
-To convert csv to tab-separated (removing quotes):
-cat country.csv | gawk -F "," '{print $1 "\t" $2}' | sed 's/"//g' > country.tab
-
-
-wget 'http://dev.maxmind.com/static/csv/codes/country_continent.csv' -O countryToContinent.csv
-To convert csv to tab-separated:
-cat countryToContinent.csv | gawk -F "," '{print $1 "\t" $2}' > countryToContinent.tab
-
-https://gist.github.com/nobuti/3816985
-Manually made continent.csv
-It seems odd that nobody has an easy to download cvs or tab-delimited continents list.
-To convert csv to tab-separated:
-cat continent.csv | gawk -F "," '{print $1 "\t" $2}' > continent.tab
-
-I have my data, so now I load the tables:
-
-This is a little script that runs autoSql on each .as to get the .sql files
-However these .sql files are not checked in to git since they are 
-so easily generated.
-
-./make-sql
-
-took the old gbNode.tab with US and euro,
-and added the japan record:
-vi gbNode.tab
-| 3    | genome-asia.ucsc.edu | JP          | Asian node of UCSC Genome Browser at the RIKEN institute of Japan | Asian Server  |
-changed "\" to tab char.
-
-made a script to create and load the .as tables
-some of them actually do not have any data yet,
-since the data is created by joins of other tables
-
-./load-tables
-
-note that because the script is so simple
-and some of the tables do not have .tab files,
-they will give errors. We will fill the empty
-tables later by inserting from a join of other tables.
-
-gbNode
-geoIpNode
-geoIpCountry
-country
-continent
-countryToContinent
-
-Check the integrity of the continent/country tables:
-
-forgot to strip off the 1st row of the csv file.
-Cleanup left-over header row.
-delete from countryToContinent where  continentId = '"continent code"';
-
-DONE maybe I need to delete A1 and A2 which are weird special categories
-that are not real countries and do not have an assigned continent which is given the value "--".
-| A1 | Anonymous Proxy      |
-| A2 | Satellite Provider   |
-These turn out not to be a problem.
-
-
-mysql> select * from countryToContinent where countryId not in (select id from country);
-+-----------+-------------+
-| countryId | continentId |
-+-----------+-------------+
-| AN        | NA          |
-| FX        | EU          |
-+-----------+-------------+
-
-AN Netherlands Antilles
-
-insert into country values ('AN', 'Netherlands Antilles');
-
-insert into country values ('FX', 'France, Metropolitan');
-
-mysql> select * from countryToContinent where continentId not in (select id from continent);
-+-----------+-------------+
-| countryId | continentId |
-+-----------+-------------+
-| A1        | --          |
-| A2        | --          |
-| O1        | --          |
-+-----------+-------------+
-
-insert into continent values ('--','not applicable');
-
-mysql> select count(*) from geoIpCountry where countryId not in (select id from country);
-+----------+
-| count(*) |
-+----------+
-|        0 |
-+----------+
-
---
-
-start off with crude continent-level assignments:
-
-make everything 1 by default
-
-insert into continentToNode select id, 1 from continent;
-
-Europe
-update continentToNode set node=2 where continentId = "EU";
-
-Asia
-update continentToNode set node=3 where continentId = "AS";
-
-Now we just can make a join
-
-insert into countryToNode (select countryId, node from countryToContinent cc, continentToNode cn where cc.continentId = cn.continentId);
-
-# fix ones requested by Bob to be included in euro instead of defaulting to america.
-# these are mostly around the middle east.
-update countryToNode set node=2 where countryId in ('AE','AM','BH','CY','GE','IL','IQ','JO','KW','LB','OM','PS','QA','SA','SY','YE');
-
-I compared the before and after updating with public_html/geoMap/test.csh at
- http://hgwdev.soe.ucsc.edu/~galt/geoMap/countryToNodeMap.html
-And it looked identical. So we did not lose anything.
-
-mysql> select distinct countryId from geoIpCountry where countryId not in (select distinct countryId from countryToNode);
-+-----------+
-| countryId |
-+-----------+
-| SS        |
-| CW        |
-| SX        |
-| BQ        |
-+-----------+
-
-mysql> select * from country where id in ('SS','CW','SX','BQ');
-+----+--------------+
-| id | name         |
-+----+--------------+
-| BQ | Bonaire      |
-| CW | Curacao      |
-| SS | South Sudan  |
-| SX | Sint Maarten |
-+----+--------------+
-
-
-So, what happened to these countries, and why are they not in countryToNode?
-Shall we add them?
-
-insert into countryToContinent values ('SS', 'AF');
-insert into countryToContinent values ('SX', 'NA');
-insert into countryToContinent values ('CW', 'NA');
-insert into countryToContinent values ('BQ', 'NA');
-
-insert into countryToNode values ('SS', '1');
-insert into countryToNode values ('SX', '1');
-insert into countryToNode values ('CW', '1');
-insert into countryToNode values ('BQ', '1');
-
-
-mysql> select distinct countryId from geoIpCountry where countryId not in (select distinct countryId from countryToNode);
-Empty set (0.13 sec)
-
-Re-ran test.csh and checked again:
-http://hgwdev.soe.ucsc.edu/~galt/geoMap/countryToNodeMap.html
-
-This means we should have full integrity.
-
-insert into geoIpNode (select ipStart, ipEnd, node from geoIpCountry gic, countryToNode ctn where gic.countryId=ctn.countryId);
-
-Query OK, 164390 rows affected (1.61 sec)
-Records: 164390  Duplicates: 0  Warnings: 0
-
-NOW IT IS READY!
-
-Note that in the future we may be putting other contries with other servers,
-like we did with the middle-east countries, overriding their default continent mappings
-in countryToNode.
-
-
-Not sure yet if I am going to do this again.
-It was what I did last time.
-The idea was to keep a backup of these important tables.
-That way perhaps hgFixed could be used when generating
-other sets of testing tables for redirect.
-
-TODO at the end:
-rename tables to real:
-gbNodeReal
-geoIpNodeReal
-
-
-Make an extra backup of just the git untracked files just in case:
-
-mkdir /hive/data/outside/geoIp/
-gitu | xargs -I X cp X /hive/data/outside/geoIp/
-
-
-Make a backup of the data too:
-
-mkdir /hive/data/outside/geoIp/geoIpTableDumps
-mkdir /hive/data/outside/geoIp/geoIpTableDumps/2016-06-23
-# sadly we need to temporarily make it write-able by all so mysqld can write dump there.
-chmod 777 /hive/data/outside/geoIp/geoIpTableDumps/2016-06-23
-hgsqldump hgFixed -T 2016-06-23 continent continentToNode country countryToContinent countryToNode gbNode geoIpCountry geoIpNode
-chmod 775 /hive/data/outside/geoIp/geoIpTableDumps/2016-06-23
-