Server Performance

Server Performance

Etienne Cherdlu
The server performance is really really bad.

I've asked this question several times now and there's been no response.

What can we do about it?

Etienne

_______________________________________________
Openstreetmap mailing list
[hidden email]
http://bat.vr.ucl.ac.uk/cgi-bin/mailman/listinfo/openstreetmap

Re: Server Performance

Erik Johansson-2
On 12/30/05, Etienne Cherdlu <[hidden email]> wrote:
> The server performance is really really bad.
>
> I've asked this question several times now and there's been no response.
>
> What can we do about it?

1. use one of the stand alone editors.
2. buy new hardware.
3. figure out a better way to do it, and implement it.

--
/Erik


Re: Server Performance

David Sheldon-5
On Sat, Dec 31, 2005 at 01:57:39AM +0100, Erik Johansson wrote:
> 1. use one of the stand alone editors.
> 2. buy new hardware.
> 3. figure out a better way to do it, and implement it.

Is the code for the tileserver in svn? I've had a look and can't see
anything called "mapserv".

Maybe there can be some optimisation done here, or some better caching.

David
--
A successful [software] tool is one that was used to do something
undreamed of by its author.
                -- S. C. Johnson


Re: Server Performance

Mikel Maron
In reply to this post by Etienne Cherdlu
I think some of the response delay is due to the holidays, and Steve is feeling under the weather lately.

Tile Server/DB performance is a big big issue, and it's getting as much attention as possible
(including myself, and I'm juggling a bunch of different drains on my attention).

For some background on the current architecture, problems, potential solutions...
http://bat.vr.ucl.ac.uk/pipermail/openstreetmap-dev/2005-December/000503.html

The WMS configuration and vector drawing scripts are in...
http://www.openstreetmap.org/trac/browser/ruby/api/wms


Discussion is probably more appropriate on the dev list, but some brief thoughts. The needs are for short term optimization, long term rearchitecting, and more hardware. In the short term, we should look at replacing mapserver with a lightweight script; it's presently only compositing different WMS requests together. CPU intensive portions of the tile generation could be replaced with C code (it's possible to inline C in Ruby). In the long term, the architecture should support pregeneration and caching of tiles. Hence the need for more hardware.

Mikel

ps Happy New Year!

----- Original Message ----
From: Etienne Cherdlu <[hidden email]>
To: [hidden email]
Sent: Friday, December 30, 2005 10:55:12 PM
Subject: [Openstreetmap] Server Performance

The server performance is really really bad.

I've asked this question several times now and there's been no response.

What can we do about it?

Etienne


Re: Server Performance

Christian van den Bosch
Okay, the past day or two the server has been unusably slow for several
periods of an hour or more. When it enters this state, even serving / on
http can take several minutes; trying to use the wiki or the applet is
futile. Is it being (D)DOSed? Are there too many concurrent users? Is it
swapping like crazy? Is its load insanely high for some other reason?

More importantly, what can we do to fix it?

Christian / cjb

http://www.cjb.ie/


Re: Server Performance

Petter Reinholdtsen

[Christian van den Bosch]
> Are there too many concurrent users?  Is it swapping like crazy? Is
> its load insanely high for some other reason?

You do not mention which time period you talk about, so it is hard to
say.  Do you see anything related on the Munin graphs for the OSM
machines?  <URL:http://bat.vr.ucl.ac.uk/munin/>



Re: Server Performance

Christian van den Bosch
Petter Reinholdtsen wrote:

> You do not mention which time period you talk about, so it is hard to
> say.  Do you see anything related on the Munin graphs for the OSM
> machines?  <URL:http://bat.vr.ucl.ac.uk/munin/>

Not seeing anything on bat's graphs anyway. I presume the nightly spike
at 2am is updatedb or something similar.

Troublesome times that spring to mind are mid-afternoon of Dec 30th, and
in the small hours of Dec 31st just after midnight (both of these seemed
to last a couple of hours). There were other times too, but I don't have
an accurate idea of when.

Around 01:30 on Dec 31st (iirc), I tried telnet www.openstreetmap.org 80
and GET / HTTP/1.0, from different IPs in Ireland and the UK, to
eliminate any question about my connectivity at home. Results varied
from not getting a TCP connection, to taking a long time (over a minute)
to answer the GET, to getting a connection but timing out after the GET
was sent.
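
The manual telnet test described above could be scripted so that each of the three outcomes is recorded with a timing. This is only a minimal sketch using Python's standard socket module; the function name `probe_http` and the outcome labels are made up for illustration and are not part of any OSM tooling.

```python
import socket
import time


def probe_http(host, port=80, timeout=60.0):
    """Send 'GET / HTTP/1.0' and classify what happens.

    Returns (outcome, seconds), where outcome is one of:
      'no_connection' - the TCP connection could not be established
      'timeout'       - connected, but no answer within `timeout` seconds
      'closed'        - connected, but the server hung up without answering
      'answered'      - the server sent at least the start of a response
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections as well as connect timeouts.
        return "no_connection", time.monotonic() - start
    try:
        with sock:
            request = "GET / HTTP/1.0\r\nHost: {}\r\n\r\n".format(host)
            sock.sendall(request.encode("ascii"))
            data = sock.recv(1024)
    except socket.timeout:
        return "timeout", time.monotonic() - start
    except OSError:
        # For example, a connection reset while waiting for the reply.
        return "closed", time.monotonic() - start
    outcome = "answered" if data else "closed"
    return outcome, time.monotonic() - start
```

Calling `probe_http("www.openstreetmap.org")` at intervals from a few different networks would give a rough log of when the slow periods start and end.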

Christian / cjb

http://www.cjb.ie/


Re: Re: Server Performance

Steve Coast
Server load, believe me, is my number one thing to fix right now. It's
swapping and can't really handle the roughly 50,000 hits a day it gets
(see the stats wiki page).

* @ 01/01/06 12:07:38 PM [hidden email] wrote:

> Petter Reinholdtsen wrote:
>
> >You do not mention which time period you talk about, so it is hard to
> >say.  Do you see anything related on the Munin graphs for the OSM
> >machines?  <URL:http://bat.vr.ucl.ac.uk/munin/>
>
> Not seeing anything on bat's graphs anyway. I presume the nightly spike
> at 2am is updatedb or something similar.
>
> Troublesome times that spring to mind are mid-afternoon of Dec 30th, and
> in the small hours of Dec 31st just after midnight (both of these seemed
> to last a couple of hours). There were other times too, but I don't have
> an accurate idea of when.
>
> Around 01:30 on Dec 31st (iirc), I tried telnet www.openstreetmap.org 80
> and GET / HTTP/1.0, from different IPs in Ireland and the UK, to
> eliminate any question about my connectivity at home. Results varied
> from not getting a TCP connection, to taking a long time (over a minute)
> to answer the GET, to getting a connection but timing out after the GET
> was sent.
>
> Christian / cjb
>
> http://www.cjb.ie/
>

have fun,

SteveC [hidden email] http://www.asklater.com/steve/


Re: Re: Server Performance

Christian van den Bosch
Hmm, this mail was meant to be a short reply, but I've kept coming back
to it over the last day or so and gone off at a couple of tangents; I've
tried to break it down into digestible chunks.

SteveC wrote:
> Server load, believe me, is my number one thing to fix right now. It's
> swapping and can't really handle the roughly 50,000 hits a day it gets
> (see the stats wiki page).

Ouch - one request every 1.7 seconds or so, assuming even distribution
(if only!). How does the ram usage break down? Is there anything in
particular causing the ram hit? I presume tile generation and/or caching
- for both slippy map and applet - is a big part of it, particularly
given the "fetching unnecessary tiles" behaviour described recently.

Pane size, tile size:

Currently, tiles are 256x128px, and the viewing pane is 700x500px (to
fit on an 800x600 screen?), which works out to 2.73x3.90 tiles; we
always fetch 4x5 tiles (=1024x640px), though sometimes (around 33% of
the time) this means we unnecessarily fetch a row or column or both. If
we changed the pane to 768x512px (3x4 tiles), we would gain around 12%
screen real-estate, still guarantee coverage with 4x5 tiles, and improve
utilisation of the tiles fetched from 53% to 60%; the number of cases
where tiles are fetched unnecessarily becomes vanishingly small (around
1% of cases), reducing the subjective "I'm waiting for it to finish
doing stuff I can't see" factor, and, as a bonus, reducing requests
generated by further scrolling (intuitively, a user who wants to see
something that's currently 10px outside the pane is likely to drag by 50
or 100px to give a "comfort zone", with a high likelihood of generating
tile requests - but if what he wants is already within the pane with
little or no margin, he won't scroll at all).
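
The figures above can be checked with a few lines of arithmetic. The sketch below assumes the pane's top-left corner is equally likely to fall on any pixel offset within a tile, and that the horizontal and vertical offsets are independent; the function names are illustrative only.

```python
TILE_W, TILE_H = 256, 128        # tile size in px, from the post
FETCH_COLS, FETCH_ROWS = 4, 5    # tiles fetched per view, from the post


def tiles_touched(pane, tile, offset):
    """Tiles overlapped by a pane of `pane` px starting `offset` px into a tile."""
    return (offset + pane - 1) // tile + 1


def pane_stats(pane_w, pane_h):
    """Return (utilisation, probability that a fetched row or column is unseen)."""
    fetched_px = FETCH_COLS * TILE_W * FETCH_ROWS * TILE_H
    utilisation = pane_w * pane_h / fetched_px
    # Count the pixel offsets at which the pane needs fewer tiles than we fetch.
    spare_col = sum(tiles_touched(pane_w, TILE_W, o) < FETCH_COLS for o in range(TILE_W))
    spare_row = sum(tiles_touched(pane_h, TILE_H, o) < FETCH_ROWS for o in range(TILE_H))
    p_all_needed = (1 - spare_col / TILE_W) * (1 - spare_row / TILE_H)
    return utilisation, 1 - p_all_needed


for w, h in ((700, 500), (768, 512)):
    used, wasted = pane_stats(w, h)
    print("{}x{}: utilisation {:.0%}, spare row/column {:.0%}".format(w, h, used, wasted))
print("area gain: {:.0%}".format(768 * 512 / (700 * 500) - 1))
```

Under those assumptions this gives 53% and 60% utilisation, a spare row or column about a third of the time for the 700x500 pane against about 1% for 768x512, and an area gain of about 12%.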

If it's not feasible to change the viewing pane size, should we consider
changing the tile size to something that fits the viewing pane better,
even if it's not in powers of two? Or is this likely to cause us massive
pain elsewhere?

Supertiles:

Would it be feasible for the server to combine requests for contiguous
tiles into a super-tile operation (perhaps on receiving an "I'm about to
ask for these 20 tiles" notification from the client), then splitting
that into the requested tiles and putting these into the cache ready for
the "proper" request?

This would significantly reduce (by a factor of 20, assuming no caching)
the number of node/segment select operations to generate a given group
of tiles, while the total number of rows returned would reduce slightly,
because any segments that cross a tile boundary will be returned once
instead of twice - as an added bonus, reducing (but not eliminating) the
artifacts we currently see where segments are shown in tiles containing
their end nodes, but go AWOL in intermediate tiles.
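
As a rough illustration of the super-tile idea, here is a sketch that runs one query for the bounding box of the whole group and then splits the result per tile into a cache. Everything in it is hypothetical: tiles are identified by (col, row) indexes, coordinates are plain pixels, `query_segments` stands in for the database, and the per-tile split uses a simple bounding-box test that can include a segment which only passes near a tile.

```python
def tile_bounds(col, row, tile_w=256, tile_h=128):
    """Pixel-space box (minx, miny, maxx, maxy) of one tile."""
    return (col * tile_w, row * tile_h, (col + 1) * tile_w, (row + 1) * tile_h)


def supertile_bounds(tiles):
    """Smallest box covering every requested (col, row) tile."""
    boxes = [tile_bounds(c, r) for c, r in tiles]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def overlaps(seg, box):
    """True if the segment's own bounding box overlaps `box`."""
    (x1, y1), (x2, y2) = seg
    minx, miny, maxx, maxy = box
    return not (max(x1, x2) < minx or min(x1, x2) > maxx or
                max(y1, y2) < miny or min(y1, y2) > maxy)


def fill_cache(tiles, query_segments, cache):
    """Run ONE query for the whole group, then split the result per tile."""
    segments = query_segments(supertile_bounds(tiles))
    for tile in tiles:
        cache[tile] = [s for s in segments if overlaps(s, tile_bounds(*tile))]


# Made-up data and a stand-in for the database that records each query.
SEGMENTS = [((10, 10), (900, 600)),    # long segment across the group
            ((300, 50), (320, 60))]    # short segment inside tile (1, 0)
calls = []


def query_segments(box):
    calls.append(box)
    return [s for s in SEGMENTS if overlaps(s, box)]


cache = {}
requested = [(c, r) for r in range(5) for c in range(4)]   # the 4x5 group
fill_cache(requested, query_segments, cache)
print(len(calls), "query for", len(cache), "tiles")
```

The later per-tile requests would then be answered from `cache` without touching the database again.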

False negatives in segment selection:

Presumably, this happens because the current tile generation mechanism
selects only nodes within the tile (something like SELECT node WHERE
node.x >= tile.minx AND node.x <= tile.maxx AND node.y >= tile.miny AND
node.y <= tile.maxy), and then selects all segments referencing one or
more of those nodes, not allowing for segments that have both ends
outside the tile but still intersect it. (I'm conveniently ignoring edge
effects at +/-180°).

Reducing false negatives:

Intuitively, something like SELECT node WHERE node.x >= tile.minx-slop
AND node.x <= tile.maxx+slop AND node.y >= tile.miny-slop AND node.y <=
tile.maxy+slop will reduce the number of false negatives, but this is
sloppy and will return a (potentially large) number of false positives.
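
A toy example may make the trade-off clearer. The sketch below imitates the presumed "select nodes in the box, then segments touching those nodes" approach on made-up data (the tile, nodes, and segment names are all invented), and shows that a small slop adds a false positive without yet catching the crossing segment.

```python
TILE = (100.0, 100.0, 200.0, 200.0)   # minx, miny, maxx, maxy, made-up units

NODES = {
    1: (50.0, 150.0),     # left of the tile
    2: (250.0, 150.0),    # right of the tile
    3: (150.0, 150.0),    # inside the tile
    4: (210.0, 210.0),    # just outside the top-right corner
    5: (260.0, 260.0),    # further away in the same direction
}
SEGMENTS = {
    "crossing": (1, 2),   # both ends outside, passes straight through
    "inside": (3, 2),     # one end inside
    "nearby": (4, 5),     # never touches the tile
}


def nodes_in_box(box, slop=0.0):
    """Ids of nodes inside the box grown by `slop` on every side."""
    minx, miny, maxx, maxy = box
    return {nid for nid, (x, y) in NODES.items()
            if minx - slop <= x <= maxx + slop and miny - slop <= y <= maxy + slop}


def segments_for_tile(box, slop=0.0):
    """Names of segments referencing at least one selected node."""
    hits = nodes_in_box(box, slop)
    return sorted(name for name, (a, b) in SEGMENTS.items() if a in hits or b in hits)


print(segments_for_tile(TILE))             # ['inside']
print(segments_for_tile(TILE, slop=20))    # ['inside', 'nearby']
print(segments_for_tile(TILE, slop=60))    # ['crossing', 'inside', 'nearby']
```

No fixed slop is large enough in general, since a segment can be arbitrarily long.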

Eliminating false negatives:

The "correct" way of identifying segments that cross our tile would
presumably be to work out the line equation for each segment, and use
simultaneous equations to fill in the tile edge values to each line
equation for each segment, then check to see where the endpoints of the
segment are in relation to the tile. This would, however, be CPU-bound,
and gives no possibility of an indexable query.
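
For reference, an exact per-segment test might look like the following. Note that this sketch uses Liang-Barsky parametric clipping, a standard technique, rather than literally solving the line equations as described above; the function name is illustrative.

```python
def segment_intersects_box(p1, p2, box):
    """True if the segment p1-p2 touches the axis-aligned box."""
    (x1, y1), (x2, y2) = p1, p2
    minx, miny, maxx, maxy = box
    dx, dy = x2 - x1, y2 - y1
    t_enter, t_exit = 0.0, 1.0
    # Each (p, q) pair describes one tile edge: the point at parameter t
    # lies on the inner side of that edge wherever p * t <= q.
    for p, q in ((-dx, x1 - minx), (dx, maxx - x1),
                 (-dy, y1 - miny), (dy, maxy - y1)):
        if p == 0:
            if q < 0:              # parallel to this edge and outside it
                return False
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)   # segment enters across this edge
        else:
            t_exit = min(t_exit, t)     # segment leaves across this edge
        if t_enter > t_exit:
            return False
    return True


box = (100, 100, 200, 200)
print(segment_intersects_box((50, 150), (250, 150), box))    # True
print(segment_intersects_box((80, 190), (110, 220), box))    # False
```

The second segment passes just outside the tile's corner, which is exactly the case a cheaper bounding-box test gets wrong.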

Eliminating false negatives efficiently:

Taking this approach a little further, a little geometry lets us
conclude that if both ends of a segment fall beyond the same edge of a
tile, the tile and segment won't intersect; we could use this, but it
means our select makes twice as many comparisons as it currently does,
something like (forgive my pseudo-SQL): SELECT segments WHERE NOT (
(segment.node1.x < tile.minx AND segment.node2.x < tile.minx) OR
(segment.node1.x > tile.maxx AND segment.node2.x > tile.maxx) OR
(segment.node1.y < tile.miny AND segment.node2.y < tile.miny) OR
(segment.node1.y > tile.maxy AND segment.node2.y > tile.maxy) ).

However, unlike the previous model, this is indexable. Indeed, we're
really only interested in comparing the greater of node[1,2].x with
minx, so we could store that in advance, halving query complexity at the
cost of a much messier table, something like SELECT segments WHERE NOT (
segment.rightnode.x < tile.minx OR segment.leftnode.x > tile.maxx OR
segment.topnode.y < tile.miny OR segment.botnode.y > tile.maxy).

This query will result in a few (I think, surprisingly few) false
positives, but is far more efficient overall than checking every segment
for intersection with the tile. False positives are less of a concern
than false negatives in this case.
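
To make the precomputed-extent idea concrete, here is a sketch using an in-memory SQLite database. The table layout and column names (`min_x`, `max_x`, `min_y`, `max_y`) are assumptions for illustration, not the actual OSM schema, and the sketch does not show how well any particular database would use the indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE segments (
        id INTEGER PRIMARY KEY,
        min_x REAL, max_x REAL,   -- extents precomputed from the two end nodes
        min_y REAL, max_y REAL
    );
    CREATE INDEX segments_min_x ON segments (min_x);
    CREATE INDEX segments_max_x ON segments (max_x);
""")


def add_segment(seg_id, p1, p2):
    """Store a segment's extents rather than its raw end nodes."""
    (x1, y1), (x2, y2) = p1, p2
    conn.execute("INSERT INTO segments VALUES (?, ?, ?, ?, ?)",
                 (seg_id, min(x1, x2), max(x1, x2), min(y1, y2), max(y1, y2)))


def segments_for_tile(minx, miny, maxx, maxy):
    """NOT (a OR b OR c OR d), rewritten as four ANDed range tests."""
    rows = conn.execute(
        "SELECT id FROM segments "
        "WHERE max_x >= ? AND min_x <= ? AND max_y >= ? AND min_y <= ? "
        "ORDER BY id",
        (minx, maxx, miny, maxy))
    return [row[0] for row in rows]


add_segment(1, (50, 150), (250, 150))     # crosses the tile, both ends outside
add_segment(2, (210, 210), (260, 260))    # well clear of the tile
add_segment(3, (80, 190), (110, 220))     # passes just outside a corner

print(segments_for_tile(100, 100, 200, 200))   # [1, 3]
```

Segment 1 is the case the node-based select misses, and segment 3 is one of the false positives: its extents overlap the tile although the line itself does not.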

Reducing server load:

I can see that doubling the complexity of a select on an already
overloaded server could be an issue; however, if we were to combine this
with the super-tiling approach, we'd still be roughly an order of
magnitude better off than we are now, though our code complexity would
have increased a bit.

Other ramblings:

How does RAM and CPU usage on the server break down between map viewing,
editing, wiki, trac, mail, other?

How much space does bat have for more RAM, and would this be a good
target for a funding drive?

I have a dual Xeon with 1G of ram and around 80G of disk free, running
Ubuntu Hoary, that's currently doing very little other than generating
heat, but it's at the wrong end of a very asymmetric (2M/128kbit) DSL
link. Is there any processing, compilation, development work, whatever,
that could usefully be offloaded to that?

Apologies again for the length of this mail!

Cheers,

Christian / cjb

http://www.cjb.ie/



Re: Re: Server Performance

Raphael Jacquot-2
Christian van den Bosch wrote:

> Currently, tiles are 256x128px, and the viewing pane is 700x500px (to
> fit on an 800x600 screen?), which works out to 2.73x3.90 tiles; we
> always fetch 4x5 tiles (=1024x640px), though sometimes (around 33% of
> the time) this means we unnecessarily fetch a row or column or both. If
> we changed the pane to 768x512px (3x4 tiles), we would gain around 12%

800x600 is obsolete.
the pane size should be 768*512

> Supertiles:
>
> Would it be feasible for the server to combine requests for contiguous
> tiles into a super-tile operation (perhaps on receiving an "I'm about to
> ask for these 20 tiles" notification from the client), then splitting
> that into the requested tiles and putting these into the cache ready for
> the "proper" request?

that would make the javascript more complicated.
perhaps, asking for the tiles in a spiralling fashion as google maps do
it may be simpler

> This would significantly reduce (by a factor of 20, assuming no caching)
> the number of node/segment select operations to generate a given group
> of tiles, while the total number of rows returned would reduce slightly,
> because any segments that cross a tile boundary will be returned once
> instead of twice - as an added bonus, reducing (but not eliminating) the
> artifacts we currently see where segments are shown in tiles containing
> their end nodes, but go AWOL in intermediate tiles.

true, but it would reduce the effectiveness of the browser's caching
mechanism

> Eliminating false negatives efficiently:
>
> Taking this approach a little further, a little geometry lets us
> conclude that if both ends of a segment fall beyond the same edge of a
> tile, the tile and segment won't intersect; we could use this, but it
> means our select makes twice as many comparisons as it currently does,
> something like (forgive my pseudo-SQL): SELECT segments WHERE NOT (
> (segment.node1.x < tile.minx AND segment.node2.x < tile.minx) OR
> (segment.node1.x > tile.maxx AND segment.node2.x > tile.maxx) OR
> (segment.node1.y < tile.miny AND segment.node2.y < tile.miny) OR
> (segment.node1.y > tile.maxy AND segment.node2.y > tile.maxy) ).

better yet, select all lines, and do the calculations in the script
prior to drawing the lines
of course, the better solution would be to contribute geometry culling
to cairo


Re: Re: Server Performance

Christian van den Bosch
Raphaël Jacquot wrote:

> 800x600 is obsolete. the pane size should be 768*512

"Obsolete" is a bit strong, particularly given that we're talking about
users of the viewer as well as the editor. Maybe it would be best to use
javascript to set the pane size to fit in the current browser window,
with a sensible maximum like 768x512?

>> Would it be feasible for the server to combine requests

> that would make the javascript more complicated.

Not really! The javascript (client) knows what tiles it will request,
and telling the server this should be trivial - the server side would
become rather more complex though, yes.

> perhaps, asking for the tiles in a spiralling fashion as google maps do
> it may be simpler

I like this idea; it should make the user experience seem faster, with
no server side changes. However, it will still fetch the same tiles,
just in a different order, so server load will be unaffected. What I'm
mainly focussing on, in this thread, is reducing server load (with a
tangent about rendering bugs).

> it would reduce the effectiveness of the browser's caching mechanism

It shouldn't affect the browser's caching mechanism at all; I did
specify that the client still receives the same tiles as before, but
gives the server slightly more information first, so it can generate
uncached tiles in one operation instead of (some number up to) 20, and
then serve them to the client from the newly-generated cache.

Often, we will find that the server has recently generated some chunk of
the requested tiles already; in this case we can treat a row or column
on its own as a supertile, or generate individual tiles if that seems
most efficient, or even speculatively generate more tiles in the
unexplored direction to anticipate the next request.
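
One simple way to decide between these options is sketched below, purely as an illustration: if the uncached tiles form a complete rectangle (a row, a column, or the whole group), generate them as one supertile, otherwise fall back to individual tiles. The function name and the (col, row) tile identifiers are hypothetical, and speculative generation is left out.

```python
def plan_generation(requested, cached):
    """Decide what to generate for one request.

    Returns a list of (min_col, min_row, max_col, max_row) tile rectangles.
    If the uncached tiles exactly fill their bounding rectangle, that
    rectangle is returned as one supertile; otherwise each uncached tile
    is returned on its own.
    """
    missing = sorted({t for t in requested if t not in cached})
    if not missing:
        return []
    cols = [c for c, _ in missing]
    rows = [r for _, r in missing]
    box = (min(cols), min(rows), max(cols), max(rows))
    area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
    if area == len(missing):
        return [box]
    return [(c, r, c, r) for c, r in missing]


# The previous view covered columns 0-3; the user has scrolled one column right.
cached = {(c, r) for c in range(0, 4) for r in range(5)}
requested = [(c, r) for c in range(1, 5) for r in range(5)]
print(plan_generation(requested, cached))   # [(4, 0, 4, 4)]
```

A diagonal scroll leaves an L-shaped gap, which this sketch handles tile by tile; splitting it into one row and one column would be a natural refinement.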

I would try to code this, but I'm up to my eyes in other projects for
the next few months.

> better yet, select all lines,

Do you really mean all lines in the database? Surely, eliminating nearly
all lines we don't want, while guaranteeing we get all the ones we do
want, has to be better than that?

> and do the calculations in the script prior to drawing the lines

No harm, but...

> of course, the better solution would be to contribute geometry culling
> to cairo

...the line drawer itself, if at all well implemented, should be able to
return very quickly when there's nothing to be done for the given line
in the given tile / super-tile. To do a yes/no calculation for each line
before calling the line drawer may even be duplication of effort, given
that the line drawer will perform an almost identical calculation in
order to figure out where to draw the line.

Christian / cjb

http://www.cjb.ie/

