region sizes #11
https://www.gstatic.com/ipranges/cloud.json

from collections import defaultdict
import requests

prefixes = requests.get('https://www.gstatic.com/ipranges/cloud.json').json()['prefixes']
regions = defaultdict(lambda: 0)
sum = 0
for prefix in prefixes:
    if 'ipv4Prefix' not in prefix:  # entries without an ipv4Prefix are IPv6-only; skip them
        continue
    mask = prefix['ipv4Prefix'].split('/')[1]
    regions[prefix['scope']] += 2**(32-int(mask))
    sum += 2**(32-int(mask))
for region in regions:
    print(region + ": " + str(round(regions[region] / sum, 2)))
print('total:', sum//1000000, 'million')
https://www.microsoft.com/en-us/download/details.aspx?id=41653

from collections import defaultdict
from xml.etree import ElementTree
import requests

regions = defaultdict(lambda: 0)
sum = 0
for region in ElementTree.fromstring(requests.get('https://download.microsoft.com/download/0/1/8/018E208D-54F8-44CD-AA26-CD7BC9524A8C/PublicIPs_20200824.xml').text):
    for cidr in region:
        mask = cidr.attrib['Subnet'].split('/')[1]
        regions[region.attrib['Name']] += 2**(32-int(mask))
        sum += 2**(32-int(mask))
for region in regions:
    print(region + ": " + str(round(regions[region] / sum, 2)))
print('total:', sum//1000000, 'million')
I should consider switching the Azure source: https://twitter.com/0xdabbad00/status/1275821557785309184 https://www.microsoft.com/en-us/download/details.aspx?id=56519

from collections import defaultdict
import requests

prefixes = requests.get('https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20220523.json').json()['values']
regions = defaultdict(lambda: 0)
sum = 0
for prefixList in prefixes:
    for prefix in prefixList['properties']['addressPrefixes']:
        if ':' in prefix:  # skip IPv6 ranges
            continue
        mask = prefix.split('/')[1]
        try:
            # only count "Service.Region" tags; service-only tags have no region after the dot
            regions[prefixList['name'].split('.')[1]] += 2**(32-int(mask))
            sum += 2**(32-int(mask))
        except IndexError:
            pass
        # sum += 2**(32-int(mask))
for region in regions:
    print(region + ": " + str(round(regions[region] / sum, 2)))
print('total:', sum//1000000, 'million')
@0xdabbad00 most CIDRs seem to be listed multiple times in that newer Azure IP range file: once by service and then again by service + region. Confirmed there are no duplicates in GCP's data, found some duplicates in AWS's data that need to be handled too, and there are still some duplicate CIDR ranges in Azure's data even after only counting the service + region entries.
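As a rough illustration of that first-pass fix (my sketch, not code from the thread), one could collect each IPv4 CIDR string at most once per region by only counting the "Service.Region" tags and de-duplicating with sets before summing. This only removes exact duplicates; partially overlapping ranges still slip through, which is what the netaddr approach below addresses.

from collections import defaultdict
import requests

values = requests.get('https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20220523.json').json()['values']
seen = defaultdict(set)                      # region -> set of unique IPv4 CIDR strings
for tag in values:
    parts = tag['name'].split('.')
    if len(parts) < 2:                       # skip service-only tags; they repeat the regional CIDRs
        continue
    for prefix in tag['properties']['addressPrefixes']:
        if ':' in prefix:                    # skip IPv6 ranges
            continue
        seen[parts[1]].add(prefix)

regions = {region: sum(2**(32 - int(cidr.split('/')[1])) for cidr in cidrs)
           for region, cidrs in seen.items()}
total = sum(regions.values())
for region in regions:
    print(region + ": " + str(round(regions[region] / total, 2)))
print('total:', total // 1000000, 'million')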
Yep, there's a lot of overlap in AWS's ip-ranges, and you need to account for it. For instance, both of these entries exist in ip-ranges.json, but they're clearly for the same exact IP addresses.
There is also some overlap that's not quite as obvious; one larger example is:
All of the ranges in that list are either wholly or partly included in the first range in the list. To account for this, the code in my stuff uses netaddr:

from collections import defaultdict
import requests
from netaddr import IPSet, IPNetwork

prefixes = requests.get('https://ip-ranges.amazonaws.com/ip-ranges.json').json()['prefixes']

# Just output a few random demo regions
demo_regions = ["us-west-2", "us-east-1", "ap-southeast-1"]

def patmyron_method(prefixes):
    regions = defaultdict(lambda: 0)
    sum = 0
    for prefix in prefixes:
        mask = prefix['ip_prefix'].split('/')[1]
        regions[prefix['region']] += 2**(32-int(mask))
        sum += 2**(32-int(mask))
    for region in demo_regions:
        print(region + ": " + str(round(regions[region] / sum, 2)))
    print('total:', sum//1000000, 'million')

def seligman_method(prefixes):
    regions = defaultdict(list)
    for prefix in prefixes:
        cur_network = IPNetwork(prefix['ip_prefix'])
        regions[prefix['region']].append(cur_network)
        regions["_all_"].append(cur_network)
    all_ips_set = IPSet(regions["_all_"])  # IPSet merges duplicate and overlapping ranges
    for region in demo_regions:
        region_set = IPSet(regions[region])
        print(f"{region}: {len(region_set) / len(all_ips_set) : 0.2f}")
    print(f'total: {len(all_ips_set)//1000000} million')

for x in ["patmyron_method", "seligman_method"]:
    print(f"{'-'*10} {x} {'-'*50}")
    globals()[x](prefixes)

which outputs:
Care must be taken when dealing with netaddr, since it can quickly turn into an O(N^2) problem if you're not careful, doubly so with IPv6 addresses, but it's manageable if you prepare the lists so you only go through them once. I've gotten in the habit of doing this sort of logic for all of the cloud providers, though I think it's really only important for AWS. Truth be told, I'm not sure; it's safer to assume they're all a mess.
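A small sketch of that pattern as I read it (mine, not code from the comment): collect plain networks into lists and build each IPSet in one call, rather than unioning IPSet objects inside the loop, since every union re-normalizes the accumulated set. The exact cost depends on netaddr internals, so treat the "slow" variant as illustrative.

import requests
from netaddr import IPNetwork, IPSet

prefixes = requests.get('https://ip-ranges.amazonaws.com/ip-ranges.json').json()['prefixes']

# Pattern to avoid: growing an IPSet by repeated unions inside the loop
slow = IPSet()
for p in prefixes:
    slow |= IPSet([p['ip_prefix']])          # each union rebuilds and compacts the whole set

# Pattern used above: gather first, construct the IPSet once
fast = IPSet(IPNetwork(p['ip_prefix']) for p in prefixes)

assert len(slow) == len(fast)                # same merged address space either way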
Glad the relative region percentages still look similar; that's the main data I was investigating, and I was hoping duplicates were roughly even between regions until I handled them. Wild.
Yep, I looked at how much it impacts the final charts:
Most of the regions are below 0.1% off, with a few outliers around 0.3% to 0.5%. Certainly good enough to convey the sizes.
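Those per-region deltas can also be computed directly; here is a quick sketch (mine, not from the thread) that compares the raw byte-count share against the IPSet-deduplicated share for every region. Cross-region overlap is ignored here, so the numbers are approximate.

from collections import defaultdict
import requests
from netaddr import IPNetwork, IPSet

prefixes = requests.get('https://ip-ranges.amazonaws.com/ip-ranges.json').json()['prefixes']

raw = defaultdict(int)                       # region -> address count, duplicates included
nets = defaultdict(list)                     # region -> list of IPNetwork objects
for p in prefixes:
    mask = int(p['ip_prefix'].split('/')[1])
    raw[p['region']] += 2**(32 - mask)
    nets[p['region']].append(IPNetwork(p['ip_prefix']))

raw_total = sum(raw.values())
dedup = {region: len(IPSet(networks)) for region, networks in nets.items()}
dedup_total = sum(dedup.values())            # ignores overlap across regions

for region in sorted(raw):
    delta = raw[region] / raw_total - dedup[region] / dedup_total
    print(f"{region}: {delta:+.3%}")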
Just wanted to leave a quick thank you here (and here). Your repo has given me the idea to also calculate the IP addresses for Google regions. You can see the result here: https://gcloud-compute.com/regions.html
@Cyclenerd GoogleCloudPlatform/region-picker#10 would be another great add, but GCP is the one provider I've never been able to automate scraping that information for:
https://ip-ranges.amazonaws.com/ip-ranges.json
https://old.reddit.com/r/aws/comments/j3luvy/can_anyone_tell_me_or_send_me_documentation_on/g7dl4ip/