8 Feb 2002 20:03
8 Feb 2002 20:39
Re: [Wikipedia-l] Plans for RecentChanges page; programmers only!
(moving thread to wikitech-l)

Jan Hidders wrote:
>From: "Brion L. VIBBER" <brion@...>
>
>>Jan Hidders wrote:
>>
>>>Dear fellow programmers,
>>>
>>>I saw that the SQL code for the Recent Changes page is rather inefficient
>>>and causes a lot of database access, so I decided to improve this. However,
>>>I can only do this properly if the timestamp field in the tables is split in
>>>a day and a time field.
>>>
>>Can I ask how exactly that would help? I'm not much of a database guru,
>>so the answer isn't obvious to me and I'm a bit curious.
>>
>
>It allows me to do a GROUP BY on the day. That way I can take a left outer
>join between the cur table and the old table and group on the combination of
>cur_day and cur_title. This allows me to get all the information I need for
>the page in one SQL statement.
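The single-statement query Jan describes might look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for PHP and MySQL so that it can be run anywhere. Only cur_day and cur_title come from the message; the other column names (cur_time, old_title, old_day, old_time) and the sample rows are assumptions for illustration, not the actual Wikipedia schema.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cur (cur_title TEXT PRIMARY KEY, cur_day TEXT, cur_time TEXT);
        CREATE TABLE old (old_title TEXT, old_day TEXT, old_time TEXT);
        INSERT INTO cur VALUES ('Apple', '2002-02-08', '20:03:00');
        INSERT INTO cur VALUES ('Pear',  '2002-02-07', '09:15:00');
        INSERT INTO old VALUES ('Apple', '2002-02-08', '18:00:00');
        INSERT INTO old VALUES ('Apple', '2002-02-08', '19:30:00');
        INSERT INTO old VALUES ('Apple', '2002-02-01', '12:00:00');
        """
    )
    return conn


def recent_changes(conn):
    """One statement: each current page plus its count of same-day older revisions."""
    return conn.execute(
        """
        SELECT cur_day, cur_title, COUNT(old_title) AS older_revisions
        FROM cur
        LEFT OUTER JOIN old
          ON old_title = cur_title AND old_day = cur_day
        GROUP BY cur_day, cur_title
        ORDER BY cur_day DESC, cur_title
        """
    ).fetchall()


for row in recent_changes(make_demo_db()):
    print(row)
```

The left outer join keeps pages that have no older revision on that day (their count is 0), which an inner join would drop from the listing.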
8 Feb 2002 21:06
Recent changes; flush cache
Jan's change to the database has my blessing, as he is a (the!) database expert. Anyway, it isn't *my* baby anymore, it is *ours* ;) Yeah, the blessings of modern cloning technology...

As for the cache flushing mechanism Brion proposed in a private mail, the easiest way would be to link it to the cur_counter field; if "cur_counter MOD 20 == 0", the cache could be flushed, best by setting the local $cache variable to "" at the end of the load routine. 20 is just a wild guess...

Finally, thanks to Jimbo for setting up this mailing list. Gee, I never caused a mailing list before ;)

Magnus
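A minimal sketch of the counter-based flush, written in Python rather than the site's PHP. The dictionary stands in for a row of the cur table; cur_counter is from the message, while cur_cache, cur_text, and the render callback are hypothetical names introduced here.

```python
FLUSH_EVERY = 20  # the "wild guess" interval from the message


def load_article(row, render):
    """Serve one page view; `row` is a dict standing in for a cur table row.

    `cur_cache`, `cur_text`, and `render` are hypothetical names for illustration.
    """
    row["cur_counter"] += 1
    cache = row["cur_cache"]
    if cache == "":
        cache = render(row["cur_text"])  # cache miss: render and remember
    html = cache
    if row["cur_counter"] % FLUSH_EVERY == 0:
        cache = ""  # flush at the end of the load routine
    row["cur_cache"] = cache
    return html
```

With this scheme a page is re-rendered once per 20 views, so a stale cached copy survives for at most 20 views, however long those take to accumulate.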
8 Feb 2002 21:51
Questions about installing
When I check everything out of the CVS, I do it into a directory that has nothing to do with the real site. Then, I copy the files over to the proper location.

It seems that wikiText.php AND wikiTextEn.php are always different, and I have to edit them... so really, I shouldn't be copying them unless there's a good reason, right?

Here's my exact question: wikiTextEn.php warns me:

# ATTENTION:
# To fit your local settings, PLEASE edit wikiText.php ONLY!
# Change settings here ONLY if they're to become global in all wikipedias!

But that seems a bit "opposite" to me... doesn't wikiTextEn mean "wikiText English"? If so, then changes here should ONLY affect the English wikipedia, not "global in all wikipedias"?

Also, whichever way it is supposed to be, I'm sure I should only have to edit one file. But I have to edit two.

First, $wikiCurrentServer returns http://wikipedia.com in the default configuration, but we prefer http://www.wikipedia.com/ (see line 12 of wikiTextEn.php, I always edit to hardcode this.)

And on the next line, $wikiSQLServer is different locally: the database is named "wiki" instead of "wikipedia".
8 Feb 2002 22:38
Re: Questions about installing
Jimmy Wales wrote:
>wikiTextEn.php warns me:
># ATTENTION:
># To fit your local settings, PLEASE edit wikiText.php ONLY!
># Change settings here ONLY if they're to become global in all wikipedias!
>
>But that seems a bit "opposite" to me... doesn't wikiTextEn mean "wikiText English"?
>If so, then changes here should ONLY affect the English wikipedia, not "global in
>all wikipedias"?
>
My understanding is that wikiTextEn.php is going to contain all the
default values. For the other-language versions, alternate server name,
character set, message strings, etc. will be set in e.g. wikiTextDe.php
or wikiTextPl.php. The theory is that anything that's *not* set in the
language-specific file will get the default value; thus if a new feature
is added that needs a message string $wikiFooBar, the English message
defined in wikiTextEn.php will show up, rather than nothing, if the
local wikiTextXx.php isn't updated.
Ultimately, wikiText.php should probably be nothing more than:
include("wikiTextEn.php");
include("wikiTextSomeOtherLanguage.php");
Anything additional in that file might be for site-specific data; for
instance if somebody sets up a read-only mirror of Wikipedia, they could
customize just that file to include their alternate server name, a title
string that links to the live 'pedia, and a hypothetical
refuse-all-edits option.
8 Feb 2002 22:35
Re: Questions about installing
Brion L. VIBBER wrote:
> But then, should it be called wikiText.php at all? Would
> wikiSettings.php make more sense, maybe?

Yes! Or, wikiLocalSettings. This would be settings which override whatever may be in the "default" package, but specific to this _site_. For example, on Magnus's machine, the database is 'wikipedia', and the user/password for mysql are different. So those would all go in the wikiLocalSettings file.

> I think the wikiTextEn.php defaults should be what you're actually using
> on the English wikipedia!

Right, especially for internationalization stuff. What might make sense would be for us to have wikiSettings or wikiLocalSettings, and that's where stuff goes that we are _fairly confident_ will be different on different people's machines.
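The layering discussed in this thread (English defaults, then a language file, then site-local settings) can be sketched as a simple ordered merge. This is a Python illustration of the idea, not the PHP include mechanism itself. The keys wikiCurrentServer, wikiSQLServer, and wikiFooBar come from the messages; wikiMainPage and all message-string values are made up for the example.

```python
def effective_settings(*layers):
    """Merge settings layers; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Defaults, as wikiTextEn.php is described in the thread.
defaults_en = {
    "wikiCurrentServer": "http://wikipedia.com",
    "wikiSQLServer": "wikipedia",
    "wikiFooBar": "Foo bar",        # hypothetical English message string
    "wikiMainPage": "Main Page",    # hypothetical key and value
}

# A language file (e.g. wikiTextDe.php) that has not yet translated wikiFooBar.
language_de = {"wikiMainPage": "Hauptseite"}  # hypothetical value

# Site-specific overrides, as proposed for wikiLocalSettings.
local = {
    "wikiCurrentServer": "http://www.wikipedia.com/",
    "wikiSQLServer": "wiki",
}

settings = effective_settings(defaults_en, language_de, local)
print(settings)
```

Because the local layer is merged last, a fresh CVS checkout can overwrite the default and language files without touching the site's own database name or server URL.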
8 Feb 2002 22:43
Re: Re: [Wikipedia-l] Plans for RecentChanges page; programmers only!
Jan Hidders wrote:
>From: "Brion L. VIBBER" <brion@...>
>
>>Right. Hmm, can you use TO_DAYS(cur_timestamp) or some such? Or is that
>>just going to cause problems?
>>
>
>Unfortunately MySQL only allows column names in the GROUP BY clause.
>
D'oh! Well then, two columns it is.

-- brion vibber (brion <at> pobox.com)
8 Feb 2002 23:35
Speedup suggestions
Browsing around on http://www.mysql.com/documentation/ and comparing with the wikipedia.sql in cvs, I have the following suggestions:

1) To speed up searches, we should use a FULLTEXT index on title and text, and then use the match operator. That should also yield more relevant results. (http://www.mysql.com/doc/F/u/Fulltext_Search.html)

2) In special_recentchanges.php, we select with "WHERE cur_timestamp>$mindate", but cur_timestamp is not indexed. This means that mysql linearly searches through the whole cur table every time somebody views RecentChanges.

3) Assuming that php runs as an apache module, we should use persistent database connections. That way, we won't repeatedly send over the username and password; one connection is reused by apache even after the php script dies. (http://www.phpbuilder.com/manual/features.persistent-connections.php and http://www.phpbuilder.com/manual/function.mysql-pconnect.php)

Axel
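The effect of suggestion 2 can be observed with a query plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, since it needs no server; on MySQL the analogous steps would be a CREATE INDEX on cur_timestamp and an EXPLAIN of the SELECT. The table layout, the index name idx_cur_timestamp, and the timestamp value are assumptions for illustration.

```python
import sqlite3

QUERY = "SELECT cur_title FROM cur WHERE cur_timestamp > ?"


def query_plan(conn, mindate):
    """Return SQLite's plan description for the RecentChanges-style query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (mindate,)).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cur (cur_title TEXT, cur_text TEXT, cur_timestamp TEXT)")

plan_before = query_plan(conn, "20020201000000")  # no index available yet
conn.execute("CREATE INDEX idx_cur_timestamp ON cur (cur_timestamp)")
plan_after = query_plan(conn, "20020201000000")   # index on cur_timestamp exists

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table, while the second names the new index, which is the difference between reading every row and reading only the rows newer than the cutoff.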
8 Feb 2002 23:50
Re: Speedup suggestions
> 2) In special_recentchanges.php, we select with "WHERE
> cur_timestamp>$mindate", but cur_timestamp is not indexed. This
> means that mysql linearly searches through the whole cur table
> every time somebody views RecentChanges.

Why not simply log change-events to a separate table, and display data from that log depending on the user's filter? Changes older than, say, two weeks could be discarded by a cronjob.

-- Daniel
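Daniel's proposal could be sketched as follows, again in Python with sqlite3 standing in for PHP and MySQL. The table name change_log, its columns, and the function names are hypothetical; prune() plays the role of the cronjob.

```python
import sqlite3
from datetime import date, timedelta


def open_log():
    """Create the separate, small table that RecentChanges would read from."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE change_log (title TEXT, editor TEXT, changed_on TEXT)")
    conn.execute("CREATE INDEX idx_change_log_day ON change_log (changed_on)")
    return conn


def log_change(conn, title, editor, day):
    """Record one change event; called whenever a page is saved."""
    conn.execute(
        "INSERT INTO change_log VALUES (?, ?, ?)", (title, editor, day.isoformat())
    )


def prune(conn, today, keep_days=14):
    """Discard events older than keep_days (the job a cronjob would run)."""
    cutoff = (today - timedelta(days=keep_days)).isoformat()
    conn.execute("DELETE FROM change_log WHERE changed_on < ?", (cutoff,))


def recent(conn, editor=None):
    """List logged changes, newest first, optionally filtered by editor."""
    sql = "SELECT title, editor, changed_on FROM change_log"
    params = ()
    if editor is not None:
        sql += " WHERE editor = ?"
        params = (editor,)
    return conn.execute(sql + " ORDER BY changed_on DESC", params).fetchall()
```

Since the log only ever holds about two weeks of events, the RecentChanges query reads a small table whose size does not grow with the number of articles.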
8 Feb 2002 23:56
Re: Speedup suggestions
Axel Boldt wrote:
> 3) Assuming that php runs as an apache module, we should use
> persistent database connections. That way, we won't repeatedly send
> over the username and password; one connection is reused by apache
> even after the php script dies.
> (http://www.phpbuilder.com/manual/features.persistent-connections.php
> and http://www.phpbuilder.com/manual/function.mysql-pconnect.php)

After reading these two pages, it seemed that all I needed to do was change mysql_connect in databaseFunctions.php to mysql_pconnect.

It's a little early to get *too* excited, but so far the results seem astonishing! This is the first time I've seen the 5 minute load on the machine under 3 in days. And it's 0.46 now.

Also, I'm getting 2.52 pages per second from my little benchmarking tool. This is dramatically better than the 1/2 page per second pace we've been seeing.

--Jimbo
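The idea behind a persistent connection is roughly that a long-lived server process keeps its connection open between requests instead of reconnecting each time. The Python sketch below illustrates that pattern with sqlite3; it is an analogy for the behaviour described above, not PHP's implementation, and the names pconnect, handle_request, and hits are hypothetical.

```python
import sqlite3

_connections = {}  # lives as long as the process, like a server child's connection


def pconnect(database):
    """Return the process-wide connection for `database`, opening it only once."""
    conn = _connections.get(database)
    if conn is None:
        conn = sqlite3.connect(database)
        _connections[database] = conn
    return conn


def handle_request(database):
    """Simulate one page request that needs the database."""
    conn = pconnect(database)  # reused after the first request
    conn.execute("CREATE TABLE IF NOT EXISTS hits (n INTEGER)")
    conn.execute("INSERT INTO hits VALUES (1)")
    return conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
```

Calling handle_request(":memory:") twice returns 1 and then 2: a fresh in-memory connection would start from an empty database each time, so the growing count shows that the second request reused the first request's connection.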