Money Talk

How we collected the database of all domains in the world

Submitted by meverel, Thread ID: 227704

05-12-2021, 04:37 PM
This post was last modified: 05-12-2021, 04:40 PM by meverel
#1
[font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif]Our company once decided to run a market study of CMSs. A CMS (Content Management System) is an engine, platform, or site builder that lets you manage the content of a site. I wanted to know what share we held in the local market and what share we held worldwide.
The question was where to get a list of domains for studying the various CMSs (Magento, Joomla, Drupal). To my surprise, no such database existed, and it still does not. Here is what happened next ...[/font]
[/font]
[/font]
[/font]

[font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif]
For some individual domain zones, the list of domains was fairly easy to obtain, but that was not enough for a full-fledged global study. The list of zones is published by IANA.
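
As a starting point, the zone list itself can be read programmatically. The sketch below assumes IANA's plain-text list at `data.iana.org/TLD/tlds-alpha-by-domain.txt` (one zone per line, with a `#` comment header); the function names are made up for illustration and are not part of our service.

```python
from typing import List
from urllib.request import urlopen

# Assumed location of IANA's plain-text zone list.
IANA_TLD_URL = "https://data.iana.org/TLD/tlds-alpha-by-domain.txt"


def parse_tld_list(text: str) -> List[str]:
    """Return lowercase zone names, skipping comment and blank lines."""
    zones = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        zones.append(line.lower())
    return zones


def fetch_tld_list(url: str = IANA_TLD_URL, timeout: float = 30.0) -> List[str]:
    """Download the list and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_tld_list(resp.read().decode("ascii"))
```

Keeping the parser separate from the download makes it easy to test on a saved copy of the file.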

We started looking for options ... We work in IT and understood that the domain landscape changes every day, so downloading the database once and leaving it at that would not be enough: we needed constant access and fully automated collection of domain information. For example, in the .com zone, 100-140 thousand domains go unrenewed every day, so after a week a one-off snapshot would already contain roughly 700 thousand to 1 million stale .com entries.
I must say that checking whether a domain is registered only seems simple. Each whois service has its own rules and response format. Moreover, these change often, and you have to detect such changes. Sometimes a whois server would suddenly answer with something like "query limit" and stop returning information. I had to build a whole farm of VPS instances to collect data from different IPs and add deliberate delays between requests.
Overall, the initial setup took a couple of months of work, and it took another six months to polish the process and reach 95% automation. Full automation is not achievable: something always breaks or changes somewhere and needs attention.
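
The whois handling described above can be sketched roughly as follows. This is only an illustration, not our production code: the server names, the regular expressions, and the `lookup` callable (whatever actually sends the whois query, possibly through a different IP) are all assumptions. Only the "query limit" marker comes from what we actually saw.

```python
import re
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServerRules:
    free: re.Pattern   # matches a "domain not registered" answer
    taken: re.Pattern  # matches a normal record for a registered domain


# Hypothetical per-server rules; every real whois server needs its own.
SERVER_RULES: Dict[str, ServerRules] = {
    "whois.registry-a.test": ServerRules(
        free=re.compile(r"^no match for", re.I | re.M),
        taken=re.compile(r"^\s*domain name:", re.I | re.M),
    ),
}

# Illustrative throttling markers; "query limit" is the one from the post.
RATE_LIMIT_MARKERS = ("query limit", "rate limit", "quota exceeded")


def classify(server: str, response: str) -> str:
    """Return 'free', 'taken', 'rate_limited', or 'unknown'."""
    text = response.lower()
    if any(marker in text for marker in RATE_LIMIT_MARKERS):
        return "rate_limited"
    rules = SERVER_RULES.get(server)
    if rules is None:
        return "unknown"
    if rules.free.search(response):
        return "free"
    if rules.taken.search(response):
        return "taken"
    # An unrecognized answer usually means the server changed its format.
    return "unknown"


def query_with_backoff(
    lookup: Callable[[str], str],
    domain: str,
    server: str,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry with exponentially growing delays while the server throttles us."""
    for attempt in range(max_attempts):
        status = classify(server, lookup(domain))
        if status != "rate_limited":
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return "rate_limited"
```

Answers classified as `unknown` are worth logging and reviewing by hand, since they are the first sign that a server has changed its response format.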

Who needed it?

As a result, we ended up with our own service [/font]
[font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif]https://www.mailbase.shop/?cat_id=43[/font][font=Roboto, -apple-system, "apple color emoji", BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif] that maintains an up-to-date database of domain names sorted by CMS.
It turned out that our effort was not wasted: various companies wanted such databases for their own research. If you have ideas about how and where to get up-to-date information on registered domain names, I would welcome any comment, as well as ideas for commercial (and non-commercial) uses of the domain database.

The databases can be purchased with Bitcoin, the currently supported payment method.[/font]
[/font]
[/font]
