No. 6289
Hi guys, I'm crossposting >>/b/389653 here to get /pr/'s help with the new /local/ board idea I'm working on. I still don't have a really concrete way to approach this using PHP, and I'd appreciate some input on a good way to approach it from a programming standpoint.
>> No. 6298
Have a page that checks for a cluster within 100 miles of your own IP. If it finds one, it automatically takes you there; if there isn't a cluster within 100 miles, you can make your own or see other clusters farther away from you. Whatever you choose, it will take you to the local board with your post data. Each cluster should have its own table, with its creator's GeoIP data and the messages inside that cluster. Oh yes, and if the cluster is inactive for more than 20 minutes, delete it. Just some ideas, sorry for the run-on sentences.
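
The 100-mile check described above could be sketched roughly as follows. This is a minimal sketch in Python (the arithmetic ports directly to PHP), assuming each cluster stores the latitude and longitude of its creator as returned by some GeoIP lookup. The names `haversine_miles` and `nearest_cluster` and the sample coordinates are illustrative only, not part of any existing board software.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_cluster(visitor, clusters, max_miles=100.0):
    """Return the name of the closest cluster within max_miles, or None.

    visitor  -- (lat, lon) of the visitor, e.g. from a GeoIP lookup
    clusters -- dict mapping a hypothetical cluster name to (lat, lon)
    """
    best_name, best_dist = None, max_miles
    for name, (lat, lon) in clusters.items():
        dist = haversine_miles(visitor[0], visitor[1], lat, lon)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# Approximate coordinates, for illustration only.
clusters = {"milwaukee": (43.04, -87.91), "chicago": (41.88, -87.63)}
madison = (43.07, -89.40)
print(nearest_cluster(madison, clusters))
```

A `None` result is the case where the visitor would be offered the choice to start a new cluster or browse farther ones.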
>> No. 6300
And also, OP, have you thought about using latitude and longitude instead of miles? GeoIP includes that, too.
>> No. 6305
In the first thread about /local/ I said that basing the location on the IP address would be the way to go. Doing a full geolocation lookup on each visitor would probably be far too slow.
The idea is to somehow connect different IP subnets to different geographical areas. One could do this by looking up the IP addresses against actual geographical locations. However, here's how I would do it:

I'd analyze visitors purely based on their IP address. How this would work is that you'd define a target value for how many people should be included in each virtual board, let's say 100 people for the sake of argument. (Which is quite impossible since we all know that 7chan only has 5 simultaneous users. But anyway, bear with me.)

What happens then is that you look for clusters with lots of users. Let's say at some point that 1.2.x.y has 200 users (viewers, not just posters, to make things more effective). Then you'd split that subnet into two.
So you get two new subnets, 1.2.{0-127}.y and 1.2.{128-255}.y
Just to make it really clear: The subgroup of "all addresses beginning with 1.2.*.*" is being split into "all addresses beginning with 1.2.0.* to all addresses beginning with 1.2.127.*" and "all addresses beginning with 1.2.128.* to all addresses beginning with 1.2.255.*". Then if one of these subgroups gets too many users, you split it again, e.g. 0-127 becomes 0-63 and 64-127.
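
The splitting step above could look something like this. It is only a sketch in Python using the standard `ipaddress` module, and it assumes that "too many users" means "more than the target value"; the function name `split_busy_subnets` and the sample addresses are made up for illustration.

```python
import ipaddress


def split_busy_subnets(subnets, visitor_ips, target=100):
    """Halve any subnet holding more than `target` visitors, repeatedly.

    Returns a dict mapping each final subnet (CIDR string) to its visitor count.
    """
    ips = [ipaddress.ip_address(ip) for ip in visitor_ips]
    pending = [ipaddress.ip_network(s) for s in subnets]
    result = {}
    while pending:
        net = pending.pop()
        count = sum(1 for ip in ips if ip in net)
        if count > target and net.prefixlen < net.max_prefixlen:
            # 1.2.0.0/16 becomes 1.2.0.0/17 (third octet 0-127)
            # and 1.2.128.0/17 (third octet 128-255).
            pending.extend(net.subnets(prefixlen_diff=1))
        else:
            result[str(net)] = count
    return result


# 200 hypothetical viewers spread over three corners of 1.2.*.*
viewers = ([f"1.2.10.{i}" for i in range(60)]
           + [f"1.2.100.{i}" for i in range(60)]
           + [f"1.2.200.{i}" for i in range(80)])
boards = split_busy_subnets(["1.2.0.0/16"], viewers, target=100)
```

With this sample data the /16 is split once, and its lower half is split once more, leaving three virtual boards. Counting visitors by scanning the whole list is fine for a sketch but would want a smarter structure on a busy site.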

The observant anon will notice a few problems with this method however.
1) Even though these subnets will roughly correspond to a geographical area (down to a certain point) it also means people with the same ISP will be grouped together, since an ISP will usually own a huge subnet.
2) It gets rather inaccurate at small scale, where an ISP distributes, say, 65536 addresses randomly over a large area.
3) The subgroups will never overlap.
4) The code of Kusaba would have to be modified (obviously). It might even make sense to completely ditch the static HTML page system for /local/. The use of static HTML is based on the idea that static pages are more efficient for a relatively small number of pages. That is, it costs less processing power to fetch 1 of 10 static pages or 1 of ~1000 static threads than it would to fetch a combination of posts with an SQL query. This assumption might not even be true for regular boards any more, given the rise of better DB engines over the years, but the point I'm coming to is that it's likely more effective to generate the /local/ pages dynamically. Even if done with static pages, you'd still need to have some sort of filtering mechanism. I'd recommend a DB over, e.g., .htaccess for this task, since .htaccess has O(n) time complexity for item lookups, whereas a DB can possibly do it faster, given a good algorithm. Of course, this point is true for any implementation of /local/.

Even though my proposed method has disadvantages 1), 2) and 3), it has the huge advantage that it can be implemented without spending money on a geolocation service. And even with the disadvantages listed, I think it would work well enough for a first test implementation. It would also return some interesting demographic data which could be useful if you one day decide to go for a full-blown geolocation service: this data, coupled with geolocation data, would allow you to make a more accurate but still CPU-friendly implementation.

Just my 2 cents. If you're interested in discussing it further I'll drop on IRC. Bye for now.
>> No. 6308
IP addresses tell you the location of one's ISP, and not the location of the terminal. The choice of board should NOT be automatically determined by IP. That is just asking for a fuck-up. My ISP is 25 miles away, on the other side of the largest city in my state, which I live near. People should choose. One's IP would work well for "recommendations", but it shouldn't to any extent be a restriction. If I want to browse /local/-Tokyo while in America, I should be able to.


tl;dr: This, a million times this:
>>6298
>> No. 6330
>>6308
If people can choose their location, what's the point of having a /local/ board in the first place? People will just flock to big cities and use that as a second /b/, without being very /local/ at all.

Also let me clarify why IP masking can be a good alternative to actual geolocation, at least during a trial period.
1) If you were to use an actual geolocation service on this scale, you'd probably have to pay for it. Since 7chan is known to have so much money, that's an excellent idea.
2) If you were to base the groups solely on geographical location you might end up with empty virtual boards for lots of users. Since 7chan is known to have such a large userbase, that's an excellent idea.
3) If you were to calculate the actual distance from yourself to every other poster, that would make the DB engine use a lot of processing power. (It cannot use an index for an expression like WHERE sqrt(d1*d1+d2*d2) < 10.) Actually, you [i]could[/i] write a more efficient query for that, e.g. by defining the /local/ area as a square instead of a circle, but still...
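
The "square instead of a circle" query mentioned in point 3 could be sketched like this. It is an illustration in Python with an in-memory SQLite database, assuming a hypothetical `posts` table with `lat` and `lon` columns and a rough figure of 69 miles per degree of latitude. The box overshoots the circle at its corners and the longitude maths breaks down near the poles and across the 180° meridian, so treat it as a first-pass filter only.

```python
import math
import sqlite3

MILES_PER_DEGREE_LAT = 69.0  # rough average, an assumption for this sketch


def bounding_box(lat, lon, radius_miles):
    """Return (lat_min, lat_max, lon_min, lon_max) of a square around a point."""
    dlat = radius_miles / MILES_PER_DEGREE_LAT
    dlon = radius_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def posts_near(conn, lat, lon, radius_miles):
    """IDs of posts inside the square; plain range comparisons can use an index."""
    lat_min, lat_max, lon_min, lon_max = bounding_box(lat, lon, radius_miles)
    rows = conn.execute(
        "SELECT id FROM posts"
        " WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?"
        " ORDER BY id",
        (lat_min, lat_max, lon_min, lon_max),
    )
    return [row[0] for row in rows]


# Hypothetical schema and sample rows (approximate coordinates).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
conn.execute("CREATE INDEX posts_lat_lon ON posts (lat, lon)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(1, 43.07, -89.40), (2, 43.04, -87.91), (3, 35.68, 139.69)],
)
```

If an exact circle matters, the few rows that survive the box filter can be checked with a real distance formula afterwards.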
>> No. 6343
what has science done?!?!!
>> No. 6345
>>6330
well, then i don't care what you do because i won't even use it. why the fuck do i want to contact local people anyway. that's what craigslist is for. and IRL.
>> No. 6365
How about fuck the /local/ idea, let's make it the /nonanon/ board. At least it's better than whatever 4ailchan is capable of.
>> No. 6386
I just accidentally came into this board, and now this intrigues me. I've never heard of this /local/ idea, so anyone want to explain exactly what it is?
>> No. 6391
Like I posted in /b/, this will not work, as you will always end up with the location of the ISP. The only solution is getting users to manually pick their city, as you can get the country from IP.
>> No. 6400
I don't know how things are in the USA, but in my small Eastern European nation ISPs honor IP ranges for different locations, so at least the AdultFriendFinder ads know your location down to the nearest town, even if the ISP operates over the whole country.
>> No. 6424
>>6400
But it will always be the ISP's location.
E.g. in NZ it will always show that you are in Auckland, because it only has 1 gateway.
>> No. 6510
Reverse DNS lookups for most home ISP systems usually provide the state in which it is based. You could route requests according to that.

(e.g. Comcast: c-XXX-XXX-XXX-XXX.hsd1.tn.comcast.net)

If there is one local board for each state, it would solve all the problems that >>6330 mentions.

(Dialup users, and Tor users should be banished to their own /local/ board anyway)
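
Pulling the state out of a reverse DNS name like the Comcast example could be sketched as below. This is Python with a regular expression modelled only on that one hostname shape; other ISPs name their hosts differently, so the pattern and the helper names are assumptions for illustration.

```python
import re
import socket

# Assumed shape, taken from the Comcast example: ...hsdN.<state>.comcast.net
COMCAST_STATE = re.compile(r"\.hsd\d+\.([a-z]{2})\.comcast\.net\.?$", re.IGNORECASE)


def state_from_hostname(hostname):
    """Return the two-letter state code embedded in the hostname, or None."""
    match = COMCAST_STATE.search(hostname)
    return match.group(1).upper() if match else None


def state_from_ip(ip):
    """Reverse-resolve an IP and try to read a state from the result."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # No PTR record, or the lookup failed.
        return None
    return state_from_hostname(hostname)
```

Anything that returns `None` (no reverse record, unknown naming scheme) would need a fallback, such as asking the user.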
>> No. 6511
>>6391 here.
I just tried to log into 7track and apparently my account is disabled due to inactivity, and I had 10GB worth of up :( is there a way to recover it or register for a new 1? Sorry, dunno where else to ask and a mod will check this thread eventually.
>> No. 6570
>>6511
Post your username here and I will re-enable at anontrack.
>> No. 6607
If people just want to flock to big cities, then what's the point of a /local/?

The idea has to be based on some kind of actual want or need with the users. If they don't need a /local/ then it's a fail anyway. Let people choose what local board they want to see. If everyone chooses "tokyo" then rename /local/ to /tokyo/ and be done with it...
>> No. 6727
>>6570
Thankyou very much.
username is Red
>> No. 6830
>>6727
Done, no problem.

>>6607
This thread is not meant to discuss the idea itself; rather, its purpose is to discuss how to implement it on the site using PHP.
>> No. 6831
>>6727
No problem.

>>6607
The point of this thread is not to debate the point of the project itself; rather, the purpose is to discuss how to implement the board on the site using PHP.
>> No. 6836
>>6831
Thankyou very much :)
So this does not happen again, how long before it gets deactivated from inactivity?
>> No. 6923
It's me again...
I still maintain that my idea is a good start. Evidently it has disadvantages, above all the same-ISP clustering, but it would be relatively easy to implement, and even if it's flawed, it could give you some valuable statistics. If you want me to come on IRC and discuss this further, let me see some pink text in double hash signs.
>> No. 7249
I've worked on something like this myself for a pet project.

I decided that doing lookups based on the IP would be too restrictive: not only does it tie you to the host's location, it would also prevent collaboration in outside areas. Should I be visiting another state, I would still like to be able to log into my hometown's site.

I decided using user-specified zipcodes would be a better way. Let them specify where they want to be centered, or let them round it off for anonymity. I could say I want to see a board full of 53*** and you only know I'm from Wisconsin.

Also, doing proximity clustering is expensive, especially so in PHP where each hit gets re-generated. An easy way to do it, if using zipcodes, is to allow a range. Posts have an associated zipcode, and I might browse my zipcode plus or minus fifty, giving the whole half of my state, or my zipcode plus or minus five would give just the area I'm in.

So each user has a threshold plusminus number that is set on a page form then stored to cookie (no user accounts required), and a center zipcode also stored to a cookie. The server would have to start storing zipcodes on every post, and do a thresholding on select.

SELECT * FROM posts WHERE zip > (center_zip - thresh) AND zip < (center_zip + thresh); -- center_zip and thresh come from the user's cookies
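
A runnable version of that threshold select might look like the following. It is a sketch in Python with SQLite, assuming zipcodes are stored as integers in a hypothetical `posts` table (which drops leading zeros, so real code may prefer another representation) and that the center zipcode and threshold arrive as untrusted cookie strings. Placeholders are used so the cookie values never get pasted into the SQL text.

```python
import re
import sqlite3


def parse_zip(raw):
    """Accept exactly five ASCII digits; anything else yields None."""
    return int(raw) if re.fullmatch(r"[0-9]{5}", raw) else None


def posts_in_zip_range(conn, center_zip, thresh):
    """IDs of posts whose zipcode lies strictly within center_zip +/- thresh."""
    rows = conn.execute(
        "SELECT id FROM posts WHERE zip > ? AND zip < ? ORDER BY id",
        (center_zip - thresh, center_zip + thresh),
    )
    return [row[0] for row in rows]


# Hypothetical schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, zip INTEGER)")
conn.execute("CREATE INDEX posts_zip ON posts (zip)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(1, 53703), (2, 53705), (3, 53790), (4, 60601)],
)

center = parse_zip("53703")  # value that would come from the cookie
nearby = posts_in_zip_range(conn, center, 5)
```

Numeric closeness of zipcodes is only a loose stand-in for geographic closeness, which is the trade-off the post accepts for speed.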

It will require users to submit a zipcode with a get request, which is a minor privacy violation, but they should realize that using a locals board requires the server to know where you are!

I think it's pretty straightforward, but integrating it into your existing system might be hard; that's why I was creating one fresh.
>> No. 7250
>>7249

Oi. I just remembered that my system was designed for postings like bulletin boards, not for discussion... my system would let someone in a border region see only half of a conversation.
Instead you need clustering into groups to prevent cross-border half-conversation visibility.

Still, I highly recommend a user-supplied zipcode method. I would rather view 53*** than anything related to my IP's location. Also, proxies. A direct numeric lookup is always faster than GeoIP too.
>> No. 7265
>>7250

Couldn't you just specify the thread starter's zip code range as the thread's location? I haven't looked at imageboard source code and I'm just now learning PHP, so I was just wondering as I think this through...
>> No. 7287
>>7265
Specifying the post's location as a range adds a variable that wasn't there when each post only had a single value. Pulling relevant posts with SQL just got a bit trickier now that each post has a min and max value for the index.
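
One way to handle the min/max case is the usual interval-overlap test: a thread is visible when its range and the viewer's range intersect, i.e. the thread's minimum is not above the viewer's maximum and its maximum is not below the viewer's minimum. The sketch below is Python with SQLite and assumes a hypothetical `threads` table with `zip_min` and `zip_max` columns.

```python
import sqlite3


def threads_visible(conn, center_zip, thresh):
    """IDs of threads whose [zip_min, zip_max] overlaps the viewer's range."""
    lo, hi = center_zip - thresh, center_zip + thresh
    rows = conn.execute(
        "SELECT id FROM threads WHERE zip_min <= ? AND zip_max >= ? ORDER BY id",
        (hi, lo),
    )
    return [row[0] for row in rows]


# Hypothetical schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE threads (id INTEGER PRIMARY KEY, zip_min INTEGER, zip_max INTEGER)"
)
conn.executemany(
    "INSERT INTO threads VALUES (?, ?, ?)",
    [(1, 53700, 53710), (2, 53800, 53900), (3, 53000, 53999)],
)
```

Because the whole thread carries one range, every reply inherits it, which avoids the half-conversation problem raised earlier in the thread.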
>> No. 7381
How about international audience?

Yeah, at the beginning they can be ignored, but still, better not to hard-code the 5-digit US system too deep; that would rule out even Canadians.
>> No. 7574
>>7381
You could just make a separate function for different kinds of postal codes.
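
A separate function per postal code format could be wired up roughly like this. The sketch is in Python and covers only the two formats mentioned in the thread, 5-digit US zipcodes and Canadian codes of the letter-digit-letter digit-letter-digit shape; the function names and the dispatch table are illustrative, and the checks validate shape only, not whether the code actually exists.

```python
import re


def normalize_us(raw):
    """Return a 5-digit US zipcode unchanged, or None if the shape is wrong."""
    code = raw.strip()
    return code if re.fullmatch(r"[0-9]{5}", code) else None


def normalize_ca(raw):
    """Return a Canadian postal code as 'A1A 1A1', or None if the shape is wrong."""
    code = raw.strip().upper().replace(" ", "")
    if re.fullmatch(r"[A-Z][0-9][A-Z][0-9][A-Z][0-9]", code):
        return code[:3] + " " + code[3:]
    return None


# One entry per supported country; adding a country means adding a function.
NORMALIZERS = {"US": normalize_us, "CA": normalize_ca}


def normalize_postal_code(country, raw):
    """Dispatch to the normalizer for `country`; unsupported countries give None."""
    normalizer = NORMALIZERS.get(country.upper())
    return normalizer(raw) if normalizer else None
```

Non-numeric formats like the Canadian one cannot use a plus-or-minus threshold directly, so they would need their own notion of "nearby" (for example, a shared prefix).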
>> No. 7964
localchan.org

That site was posted on fourtwenty chan, and after I chastised the owner for apparently stealing 7chan's idea, he said he would gladly share his PHP code with you guys. The thread is on /tech/ or /prog/, I can't remember which exactly.
>> No. 7966
how do u right php codes
>> No. 8003
Javascript based client side word filters.
Just ask each user what city they're in, and then make the wordfilter all city names to be whatever city they're from.
>> No. 8349
Just do it Craigslist style where it lists every general area.
>> No. 8540
You do know that firefox and other shit browsers have Geolocating software enabled by default, just no one knows about it or how to use it.
>> No. 8541
http://www.mozilla.com/en-GB/firefox/geolocation/#geo-demo

it is unbelievably accurate. Best way to go about this without spending money.

The alternative is to download (buy) a massive database with the first part of the IP addresses associated with a city and use that method.
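
If you go the database route, the lookup itself can be a binary search over sorted address ranges. Below is a sketch in Python, assuming the database boils down to non-overlapping (first IP, last IP, city) rows; the class name, the city names, and the sample ranges (reserved documentation addresses) are all made up for illustration.

```python
import bisect
import ipaddress


class IpRangeTable:
    """Maps IPv4 addresses to a city using sorted, non-overlapping ranges."""

    def __init__(self, ranges):
        rows = sorted(
            (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), city)
            for first, last, city in ranges
        )
        self._rows = rows
        self._starts = [row[0] for row in rows]

    def city_for(self, ip):
        """Return the city whose range contains `ip`, or None."""
        number = int(ipaddress.ip_address(ip))
        # Index of the last range that starts at or before this address.
        index = bisect.bisect_right(self._starts, number) - 1
        if index >= 0 and self._rows[index][1] >= number:
            return self._rows[index][2]
        return None


table = IpRangeTable([
    ("198.51.100.0", "198.51.100.255", "Shelbyville"),
    ("192.0.2.0", "192.0.2.255", "Springfield"),
])
```

Each lookup is O(log n) in the number of ranges, and the same idea works as an indexed range query if the rows live in a DB table instead of memory.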
>> No. 8614
>>6305
It is an interesting approach, but geolocation lookups can work if you do it for whole subnets, so not 196.123.123.1, but 196.123.123.* and then cache the results locally.
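
Caching per subnet could be sketched like this. It is Python using the standard `ipaddress` module, assumes IPv4 and a /24 granularity (the 196.123.123.* example), and takes the actual geolocation call as a parameter because which service would be used is not specified in this thread; `SubnetGeoCache` and `fake_lookup` are hypothetical names.

```python
import ipaddress


class SubnetGeoCache:
    """Calls an expensive lookup once per subnet and remembers the answer."""

    def __init__(self, lookup, prefixlen=24):
        self._lookup = lookup        # any callable: IP string -> location
        self._prefixlen = prefixlen
        self._cache = {}

    def locate(self, ip):
        # strict=False masks off the host bits: 196.123.123.77 -> 196.123.123.0/24
        network = ipaddress.ip_network(f"{ip}/{self._prefixlen}", strict=False)
        key = str(network)
        if key not in self._cache:
            self._cache[key] = self._lookup(str(network.network_address))
        return self._cache[key]


# Stand-in for a real geolocation service, recording how often it is hit.
calls = []

def fake_lookup(ip):
    calls.append(ip)
    return "somewhere"

cache = SubnetGeoCache(fake_lookup)
cache.locate("196.123.123.1")
cache.locate("196.123.123.200")  # same /24, served from the cache
cache.locate("196.123.124.5")    # different /24, triggers a second lookup
```

A real deployment would also want the cache to expire entries and to survive between requests (a DB table or a shared cache), since a plain PHP request keeps nothing in memory from one hit to the next.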
>> No. 8738
<b>bold</b> <i>italics</i>