No. 6305
In the first thread about /local/ I said that basing the location on the IP address would be the way to go. Doing a full geolocation lookup on each visitor would probably be far too slow.
That is, to somehow connect different IP subnets to different geographical areas. One could do this by looking up the IP addresses against actual geographical locations. However, here's how I would do it:
I'd analyze visitors purely based on their IP address. How this would work is that you'd define a target value for how many people should be included in each virtual board, let's say 100 people for the sake of argument. (Which is quite impossible since we all know that 7chan only has 5 simultaneous users. But anyway, bear with me.)
What happens then is that you look for clusters with lots of users. Let's say that at some point 1.2.x.y has 200 users (viewers, not just posters, to make things more effective). Then you'd split that subnet into two.
So you get two new subnets, 1.2.{0-127}.y and 1.2.{128-255}.y
Just to make it really clear: The subgroup "all addresses beginning with 1.2.*.*" is split into "all addresses from 1.2.0.* through 1.2.127.*" and "all addresses from 1.2.128.* through 1.2.255.*". Then if one of these subgroups gets too many users, you split it again, e.g. 0-127 becomes 0-63 and 64-127.
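To make the splitting step concrete, here is a rough Python sketch. It is not Kusaba code, just an illustration. The function name `split_crowded` and the rule "split whenever a subnet holds more than `target` viewers" are my own assumptions; the idea above only says a subnet gets split once it has too many users. It uses the standard `ipaddress` module, where 1.2.x.y is written as 1.2.0.0/16 and each split adds one bit to the prefix.

```python
import ipaddress


def split_crowded(root, viewer_ips, target=100):
    """Halve `root` repeatedly until no subnet holds more than `target` viewers.

    Returns a list of (subnet, viewer_count) pairs in address order.
    `split_crowded` and `target` are hypothetical names for this sketch.
    """
    root_net = ipaddress.ip_network(root)
    addrs = [ipaddress.ip_address(ip) for ip in viewer_ips]

    def recurse(net, candidates):
        inside = [a for a in candidates if a in net]
        # Stop when the group is small enough, or when the subnet is a
        # single address and cannot be halved any further.
        if len(inside) <= target or net.prefixlen == net.max_prefixlen:
            return [(net, len(inside))]
        groups = []
        # subnets(prefixlen_diff=1) yields the lower and upper half.
        for half in net.subnets(prefixlen_diff=1):
            groups.extend(recurse(half, inside))
        return groups

    return recurse(root_net, addrs)


if __name__ == "__main__":
    lower = [f"1.2.{i}.1" for i in range(100)]        # 1.2.0.* to 1.2.99.*
    upper = [f"1.2.{128 + i}.1" for i in range(100)]  # 1.2.128.* to 1.2.227.*
    for net, count in split_crowded("1.2.0.0/16", lower + upper):
        print(net, count)
```

With 100 viewers in each half, the 200-user subnet is split exactly once, into 1.2.0.0/17 and 1.2.128.0/17. Note that this sketch keeps empty halves in the result; a real implementation would probably want to merge those back or ignore them.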
The observant anon will notice a few problems with this method however.
1) Even though these subnets will roughly correspond to a geographical area (down to a certain point) it also means people with the same ISP will be grouped together, since an ISP will usually own a huge subnet.
2) It gets rather inaccurate at the small scale, where an ISP distributes, say, 65536 addresses randomly over a large area.
3) The subgroups will never overlap.
4) The code of Kusaba would have to be modified (obviously). It might even make sense to completely ditch the static HTML page system for /local/. The use of static HTML is based on the idea that static pages are more efficient for a relatively small number of pages. That is, it costs less processing power to fetch 1 of 10 static pages or 1 of ~1000 static threads than it would to fetch a combination of posts with an SQL query. This assumption might not even be true for regular boards any more, given how much DB engines have improved over the years, but the point I'm coming to is that it's likely more effective to generate the /local/ pages dynamically. Even if done with static pages, you'd still need some sort of filtering mechanism. I'd recommend a DB over, e.g., .htaccess for this task, since .htaccess has O(n) time complexity for item lookups, whereas a DB can possibly do it faster, given a good algorithm. Of course, this point is true for any implementation of /local/.
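As an illustration of the lookup point in 4), here is a small Python sketch of how a visitor's IP could be matched to its subgroup in O(log n) instead of walking a rule list one entry at a time. It is only a stand-in for what an indexed DB column would do, not anything Kusaba actually has. The class name `SubnetIndex` is made up for this example, and it assumes IPv4 only and that the subgroups never overlap (which is problem 3 above, but here it is what makes the binary search valid).

```python
import bisect
import ipaddress


class SubnetIndex:
    """Hypothetical lookup table for non-overlapping IPv4 subnets."""

    def __init__(self, cidrs):
        nets = sorted(
            (ipaddress.ip_network(c) for c in cidrs),
            key=lambda n: int(n.network_address),
        )
        self._nets = nets
        # Sorted start addresses as integers, used for binary search.
        self._starts = [int(n.network_address) for n in nets]

    def lookup(self, ip):
        """Return the subnet containing `ip`, or None if there is none."""
        addr = ipaddress.ip_address(ip)
        # Find the last subnet whose start address is <= addr.
        i = bisect.bisect_right(self._starts, int(addr)) - 1
        if i >= 0 and addr in self._nets[i]:
            return self._nets[i]
        return None


if __name__ == "__main__":
    index = SubnetIndex(["1.2.128.0/17", "1.2.0.0/18", "1.2.64.0/18"])
    print(index.lookup("1.2.70.5"))
    print(index.lookup("9.9.9.9"))
```

In an actual DB the same idea would be a table of (range start, range end, board) rows with an index on the range start, so the engine does the search for you.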
Even though my proposed method has disadvantages 1), 2) and 3), it has the huge advantage that it can be implemented without spending money on a geolocation service. And even with the disadvantages listed, I think it would work well enough for a first test implementation. It would also return some interesting demographic data, which could be useful if you one day decide to go for a full-blown geolocation service: this data, coupled with geolocation data, would allow you to make a more accurate but still CPU-friendly implementation.
Just my 2 cents. If you're interested in discussing it further, I'll drop by on IRC. Bye for now.