Team Chat Logs

April 30, 2010

[04:36:58.486074]<Hodgestar>Eep. The upgrade to schema version 12 doesn't preserve build ids.
[04:39:00.038401]<osimons>Hodgestar: que?
[04:39:41.164060]<Hodgestar>One of wbell's patches added a new field for storing the latest activity time on a build.
[04:39:57.475499]<Hodgestar>I checked the upgrade script but I missed that it doesn't copy the build id.
[04:42:40.912839]<osimons>Hodgestar: a postgres issue, or? looks correct in my db?
[04:43:04.872275]<Hodgestar>osimons: I'm using sqlite.
[04:43:31.393567]<osimons>OH. "preserve" as it renumbers them all?
[04:43:45.335336]<Hodgestar>osimons: What will protect you is having no invalidated builds.
[04:43:57.780335]<Hodgestar>So that your build numbers are already sequential.
[04:44:05.728961]<osimons>yeah.... no holes at all...
[04:44:13.635049]<osimons>hardly likely really...
[04:44:24.603641]<Hodgestar>Assuming that the select happens to return your builds in the same order as before.
[04:45:22.634887]<osimons>that then disconnects the build from its logs and reports... which explains why my lint stuff suddenly disappeared in latest builds
[04:45:28.869022]<Hodgestar>Yep.
[04:45:49.020538]<Hodgestar>I think the fix is fairly trivial -- add the id to the list of things selected and inserted.
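The renumbering described above can be reproduced with a minimal sketch. The tables `old_build` and `new_build` and their columns are hypothetical stand-ins, not Bitten's actual schema or the code in bitten/upgrades.py; the only point is that an INSERT ... SELECT which omits the id column lets SQLite assign fresh sequential ids, while including it keeps the originals.

```python
import sqlite3


def make_db():
    """Build an in-memory database with gaps in the build ids (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE old_build (id INTEGER PRIMARY KEY, config TEXT)")
    # Ids 3, 4, 6, 7 and 8 are missing, as if those builds had been invalidated.
    conn.executemany(
        "INSERT INTO old_build (id, config) VALUES (?, ?)",
        [(1, "trunk"), (2, "trunk"), (5, "trunk"), (9, "trunk")],
    )
    return conn


def upgrade(conn, preserve_ids):
    """Copy builds into a new table that has one extra column."""
    conn.execute(
        "CREATE TABLE new_build "
        "(id INTEGER PRIMARY KEY, config TEXT, last_activity INTEGER)"
    )
    if preserve_ids:
        # Fixed variant: the id is part of both the select and the insert.
        conn.execute(
            "INSERT INTO new_build (id, config, last_activity) "
            "SELECT id, config, 0 FROM old_build"
        )
    else:
        # Broken variant: the id is left out, so SQLite assigns new ones.
        conn.execute(
            "INSERT INTO new_build (config, last_activity) "
            "SELECT config, 0 FROM old_build"
        )
    return [row[0] for row in conn.execute("SELECT id FROM new_build ORDER BY id")]


broken_ids = upgrade(make_db(), preserve_ids=False)
fixed_ids = upgrade(make_db(), preserve_ids=True)
print(broken_ids)  # [1, 2, 3, 4]
print(fixed_ids)   # [1, 2, 5, 9]
```

With no gaps in the original ids the two variants would agree only if the select happened to return the rows in id order, which is the caveat raised in the conversation.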
[04:45:53.252503]<osimons>Hodgestar: this is somewhat in need of urgent fixing... i suppose you are on it then?
[04:46:10.745607]<Hodgestar>It looks like it. :)
[04:46:19.537161]<osimons>the broken ones are sort of broken already, no getting that back i suppose...
[04:46:48.009863]*osimons looks for time machine backup of the dev database...
[04:47:38.594850]<osimons>Hodgestar: that will get very messy later when it tries to create new log and report tables that already exist and so on...
[04:48:01.646961]<Hodgestar>Indeed.
[04:48:09.593735]<Hodgestar>Fix committed.
[04:48:23.844467]<CIA-37>r866 by hodgestar in bitten/upgrades.py: Preserve build id during latest upgrade.
[04:48:37.350675]<Hodgestar>It's not as well tested as I'd like (my postgres and mysql test setup is at home) but it would be hard to make things worse. :)
[04:50:20.874657]<osimons>Hodgestar: looks correct, but i'm more worried about those that may have upgraded this last week
[04:50:32.267232]<Hodgestar>Indeed.
[04:50:43.797940]<osimons>i think it may well deserve a notice on the mailing list
[04:50:53.723250]<Hodgestar>I will see if there is some way to resurrect an affected database.
[04:51:07.983317]<Hodgestar>Will you do the mailing list announcement?
[04:52:19.867790]<osimons>sure
[04:52:31.362873]<Hodgestar>Oh a much less serious note, are you still using the 0.6dev prefix for branch commits?
[04:52:38.497848]<Hodgestar>s/Oh/On/
[04:52:38.507562]<evil_twin>hodgestar meant: On a much less serious note, are you still using the 0.6dev prefix for branch commits?
[04:54:32.978794]<osimons>Hodgestar: prefix, sort of, but does not really matter. it just makes it easier to pick out in my feeds and timeline, but no big deal.
[04:55:12.742350]<osimons>i try now with no prefix for trunk, and prefix the branch. it looks cleaner, but we don't really use it for anything.
[04:55:33.911766]<osimons>Hodgestar: you're merging the fix to 0.6 now then i take it? goodie.
[04:55:52.438642]<Hodgestar>Woot! I've found a way to resurrect the database! Hooray for build.stopped and step.stopped!
[04:56:06.809724]<Hodgestar>osimons: Yes. Just merged.
[04:56:24.860048]<Hodgestar>select build.id, step.build from bitten_build as build JOIN bitten_step as step ON build.stopped = step.stopped;
[04:56:35.784625]<Hodgestar>Although it only fixes builds that finished.
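A rough sketch of how that join recovers the mapping from renumbered build ids to original ones. Only the table names and the `id`, `build` and `stopped` columns come from the query above; the reduced toy schema, the sample timestamps, the use of 0 for "not stopped yet" and the extra `stopped > 0` guard are all assumptions made for illustration. It also assumes stop timestamps are unique per build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy versions of the two tables, reduced to the columns the join needs.
conn.execute("CREATE TABLE bitten_build (id INTEGER PRIMARY KEY, stopped INTEGER)")
conn.execute("CREATE TABLE bitten_step (build INTEGER, name TEXT, stopped INTEGER)")

# After the faulty upgrade the builds are numbered 1..4, while the steps
# still carry the original build ids 1, 2, 5 and 9.
conn.executemany(
    "INSERT INTO bitten_build (id, stopped) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 500), (4, 0)],  # build 4 never finished
)
conn.executemany(
    "INSERT INTO bitten_step (build, name, stopped) VALUES (?, ?, ?)",
    [
        (1, "build", 90), (1, "test", 100),
        (2, "build", 200),
        (5, "build", 450), (5, "test", 500),
        (9, "build", 0),  # step of the unfinished build
    ],
)

# A build's stop time equals the stop time of its last step, so matching on
# it pairs each renumbered build id with the original id kept by the steps.
# The stopped > 0 guard (an assumption) skips unfinished builds, which would
# otherwise all match each other on the placeholder value.
mapping = conn.execute(
    "SELECT DISTINCT build.id, step.build "
    "FROM bitten_build AS build "
    "JOIN bitten_step AS step ON build.stopped = step.stopped "
    "WHERE build.stopped > 0 "
    "ORDER BY build.id"
).fetchall()
print(mapping)  # [(1, 1), (2, 2), (3, 5)]
```

Build 4 is absent from the result, which matches the caveat that only finished builds can be recovered this way.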
[04:56:41.810557]<CIA-37>r867 by hodgestar in branches/ (0.6.x/bitten/upgrades.py 0.6.x): 0.6dev: Merge of [866] from trunk.
[04:56:45.475581]<Hodgestar>Now to turn that into a script of some sort.
[05:00:00.534972]<osimons>Hodgestar: nice, that should work for most use-cases. a new upgrade script then, i suppose?
[05:03:49.989421]<osimons>i'll hold off on the announcement until we know for sure what works and what doesn't. i'll be around for talk & test for the next hour or so.
[05:05:39.629648]<osimons>Hodgestar: and any invalidated builds last week would then have cleaned out data belonging to another build...
[05:07:20.979961]<osimons>i can think of many corner-cases actually, but on the whole it should be OK-ish if we can recreate the build ids to the best of our ability using step info
[05:08:48.900981]<osimons>oh. sqlite already makes backups when upgrading. had forgotten about that.
[05:09:00.481872]<Hodgestar>Ooh. It does? Where do they live?
[05:09:18.406002]<osimons>in env/db/
[05:09:19.770251]<Hodgestar>Ah. Found them.
[05:09:21.617663]<Hodgestar>Woot!
[05:09:21.968245]<osimons>next to trac.db
[05:09:29.516105]<osimons>each time it upgrades
[05:09:31.598967]<osimons>useful
[05:10:44.575814]<Hodgestar>Then I propose we ditch any attempt at fixing the database with an upgrade script since it's likely to be fraught with danger.
[05:11:09.922095]<Hodgestar>And perhaps just outline the procedure for people using MySQL and PostgreSQL?
[05:11:10.080588]<osimons>basically any revision from 840 to and including 865 is not to be used.
[05:15:09.196964]<osimons>Hodgestar: can we make any assumption that the builds would even be in the right order?
[05:15:39.757411]<Hodgestar>I don't think so.
[05:15:52.795267]<osimons>ie. select from one table => another without order from build, they can in theory be jumbled?
[05:16:00.994695]<Hodgestar>Yes.
[05:16:25.499371]<osimons>that makes it somewhat harder than just trying to detect holes in ranges...
[05:17:51.331047]<Hodgestar>Yeah. I don't think that approach would work.
[05:18:02.560920]<Hodgestar>That was just how I happened to pick up what was happening.
[05:20:40.929166]<Hodgestar>I've just re-run my database upgrade here and the build ids are now correct.
[05:22:13.568852]<Hodgestar>osimons: I can send the email if you like.
[05:23:01.729110]<osimons>Hodgestar: all right then. i had just arrived at the first line of mine, but better you anyway :-)
[05:23:11.778218]<Hodgestar>Things to include: revisions affected, presence of sqlite backups created by trac, ideas on resurrecting postgresql / mysql databases where backups aren't present.
[05:23:19.096917]<Hodgestar>Anything else important?
[05:23:28.175241]<osimons>Simon Cross has just identified a major problem with the upgrade script that was part of revision 840 (and the following revision merging it to the 0.6 branch). Anyone who has installed/upgraded Bitten using this revision and up to and including revision 865 will have had their build IDs reset due to an error in upgrade script 12. It has now been fixed in revs 866 and 867 for trunk and the 0.6 branch.
[05:23:41.002800]<osimons>=> was my intro...
[05:23:49.305974]<Hodgestar>Woot. Thanks.
[05:25:06.241996]<osimons>we could also say; a) development and test setups we just recommend rolling back to latest backups and rerunning using latest source (backups in $tracenv/db/ for sqlite)
[05:26:05.242459]<Hodgestar>I will attempt to extend the upgrade test so that it catches this issue.
[05:26:31.840029]<osimons>b) in theory we could perhaps manage to recreate something using step data, but there may well be a number of corner-cases that will fall outside. not least due to what else may have changed since the upgrade. if anyone has major difficulty rolling back production and rebuilding last week's revisions, then let us know
[05:27:15.436261]<osimons>- and a final reminder not to run bleeding-edge revisions in production... :-)
[05:27:53.317973]<osimons>good catch, Hodgestar.
[05:28:28.684728]<osimons>Hodgestar: a major attention-getting title, like "Major breakage in Bitten revisions 840-865" or similar?
[05:29:17.032301]<davidfraser>Bother
[05:29:33.233567]*davidfraser upgraded our main server to r844 a while ago
[05:29:40.342564]<Hodgestar>osimons: Sounds appropriate.
[05:29:46.175863]<Hodgestar>davidfraser: Are you using sqlite?
[05:29:50.695219]<davidfraser>Hodgestar: Yep
[05:30:02.147079]<Hodgestar>davidfraser: Then you should have a backup available.
[05:30:24.352167]<osimons>Hodgestar: oh. hang on.
[05:30:29.912690]*davidfraser will investigate
[05:30:58.970933]<osimons>Hodgestar, davidfraser: how do we deal with any build logs on disk that have happened last week, and are now "disconnected"?
[05:31:29.711439]<davidfraser>oh bother. We *were* using sqlite. It's on postgres now. But we're supposed to have backups anyway - time to check if they work
[05:31:54.120115]<Hodgestar>osimons: Check file timestamps and remove the logs since the upgrade?
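The timestamp check suggested above could be sketched roughly as follows. It only lists candidate files and deletes nothing. The function name `files_modified_since`, the file names and the cutoff value are hypothetical, and the real location of Bitten's log files is not given in this conversation, so a temporary directory stands in for it.

```python
import os
import tempfile


def files_modified_since(directory, cutoff_timestamp):
    """Return paths under directory whose modification time is at or after the cutoff."""
    recent = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) >= cutoff_timestamp:
                recent.append(path)
    return sorted(recent)


# Demonstration in a throwaway directory with one file on each side of the cutoff.
upgrade_time = 1_000_000_000  # hypothetical time of the faulty upgrade (epoch seconds)
with tempfile.TemporaryDirectory() as log_dir:
    for name, mtime in [("old.log", upgrade_time - 60), ("new.log", upgrade_time + 60)]:
        path = os.path.join(log_dir, name)
        with open(path, "w") as handle:
            handle.write("log data\n")
        os.utime(path, (mtime, mtime))  # set access and modification times
    recent_names = [
        os.path.basename(p) for p in files_modified_since(log_dir, upgrade_time)
    ]

print(recent_names)  # ['new.log']
```

Reviewing the resulting list by hand before removing anything keeps this in line with the "manual clean" discussed next.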
[05:32:25.031217]<osimons>Hodgestar: perhaps. for a manual clean?
[05:32:34.711793]<osimons>should mention that too then.
[05:32:48.108032]<davidfraser>Shouldn't new build logs be associated with new ids though?
[05:33:42.897144]<Hodgestar>davidfraser: Ids might get re-used.
[05:37:36.792095]<davidfraser>Indeed, bother
[05:38:59.304500]<CIA-37>r868 by hodgestar in bitten/tests/upgrades.py: Extend upgrade test to check that build ids are preserved.
[05:41:37.326626]<CIA-37>r869 by hodgestar in branches/ (0.6.x/bitten/tests/upgrades.py 0.6.x): 0.6dev: Merge of [868] from trunk.
[05:42:28.021060]<osimons>nice, Hodgestar ^^
[05:45:51.962889]<osimons>davidfraser: and any invalidated builds last week would already have cleared out the log files on disk from some other build due to id mismatch...
[05:52:21.035895]<CIA-37>r870 by osimons in setup.py: No longer any flash charts in htdocs. Update package data.
[05:53:35.165011]<CIA-37>r871 by osimons in branches/ (0.6.x/setup.py 0.6.x): 0.6dev: Merge [870] from trunk.
[05:55:29.419454]<davidfraser>osimons: Surely the log files on disk are matched up by bitten_log id so shouldn't be overwritten?
[05:56:57.612391]<osimons>davidfraser: sort of, but if build 9 (before) => build 5 (after) get invalidated, wrong logs will be removed if build 5 gets invalidated in this period
[05:57:13.408550]<davidfraser>osimons: Right, sure
[05:57:48.453271]<davidfraser>So as long as there have been no invalidated builds, and we can use Hodgestar's trick to reconstruct old build ids, I might escape unscathed :)
[05:59:02.195142]<osimons>perhaps, perhaps. don't quite know where he went off to now as he timed out... don't really know if he is planning a "fix" or not.
[06:25:17.581417]<Hodgestar>I've written most of the email but I need to check what will happen to log files for IDs that have been re-used.
[06:25:44.953893]<Hodgestar>I have to run off now to a work event but I should have a chance to finish the email in an hour or two.
[06:32:19.267271]<davidfraser>Thanks Hodgestar
[08:30:09.640819]<Hodgestar>Mail sent.
[08:30:24.365420]<Hodgestar>The log overwriting was a red herring -- logs use their own ids. :D
[08:34:20.957657]<Hodgestar>I saw some database locking issues while slaves were running just as I was leaving work -- I wonder if the keepalive patch introduced anything there?
[08:34:45.776827]<Hodgestar>I didn't really have time to even confirm the issue before leaving work though so it may have to wait until Monday to be sorted out.
[08:44:18.033736]<Hodgestar>It could just be my slightly odd slave setup -- five slaves connect simultaneously every five minutes.
[09:40:24.541331]<osimons>Hodgestar: good mail. thanks.
[09:40:45.594969]*osimons hopes none but davidfraser actually upgraded production last week...
[09:41:45.216358]<davidfraser>osimons: I foolishly did it to test the charting, without thinking of the implications :)
[09:42:39.328148]<osimons>well, we all learn something i suppose (again) :-)
[09:44:05.323851]<osimons>db (+ protocol bumps) are high impact changes. always carry with them the potential to do much more harm than we think...
[09:45:34.080717]<osimons>davidfraser: it is also partly the fact that now builds and logs are partly db and partly file system. that makes it somewhat more difficult to manage backups and restores, and clean-cut reversals
[09:45:59.483845]<davidfraser>osimons: Indeed
[09:46:17.119388]<osimons>still, not different from trac & attachments, so just the way of the world.
[09:54:50.448624]<Hodgestar>I'm off for the evening. Night all.
[10:57:15.182364]<techtonik>How to make Bitten send notifications to many addresses? Right now it only sends email to the author of the last commit.
[11:10:28.535881]<osimons>techtonik: i have a hunch it reuses the default settings in [notification] for smtp_always_cc and smtp_always_bcc, but haven't tested it (i think those get added to all outgoing email). but yes, the bitten logic is to only notify the committer.
[11:11:12.222250]<osimons>techtonik: that said, there is a listener for new builds, so something like announcerplugin and custom notification can easily extend this
[11:14:08.348135]<techtonik>I am not really sure that everybody wants to receive notifications about new/updated tickets, but there should be a global notification email for failed builds.
[11:16:39.945611]<techtonik>How about notify_cc_for_failed_builds option?
[11:17:23.204838]<techtonik>And it is worth documenting the current notification behavior - http://bitten.edgewall.org/wiki/Documentation/notify.html
[11:22:21.267619]<osimons>techtonik: i'm sure there are some open tickets requesting improved support for notification. i'm not even sure the suggestions i had worked, if so please add a suggestion to patch the docs
[12:17:35.018055]<techtonik>http://bitten.edgewall.org/ticket/576#comment:5
[12:17:44.214573]<techtonik>can be closed
[12:17:48.651817]<techtonik>or deleted