MemoryError when repacking repository with large numbers of revisions

Bug #864217 reported by Mark Grandi
Affects | Status    | Importance | Assigned to
Bazaar  | Confirmed | Medium     | Unassigned
Breezy  | Triaged   | Medium     | Unassigned

Bug Description

I was trying an experiment where I would convert the Linux kernel git repository into a Bazaar repository. However, while using 'fast-import', Bazaar ran out of memory during the "repacking" step when I had about 16,000 revisions out of 264,207.

56378.291 Traceback (most recent call last):
  File "bzrlib\commands.pyo", line 946, in exception_to_return_code
  File "bzrlib\commands.pyo", line 1150, in run_bzr
  File "bzrlib\commands.pyo", line 699, in run_argv_aliases
  File "bzrlib\commands.pyo", line 721, in run
  File "bzrlib\cleanup.pyo", line 135, in run_simple
  File "bzrlib\cleanup.pyo", line 165, in _do_with_cleanups
  File "C:/Program Files (x86)/Bazaar/plugins\qbzr\lib\commands.py", line 821, in run
  File "C:/Program Files (x86)/Bazaar/plugins\qbzr\lib\subprocess.py", line 888, in run_subprocess_command
  File "bzrlib\commands.pyo", line 1150, in run_bzr
  File "bzrlib\commands.pyo", line 699, in run_argv_aliases
  File "bzrlib\commands.pyo", line 721, in run
  File "bzrlib\cleanup.pyo", line 135, in run_simple
  File "bzrlib\cleanup.pyo", line 165, in _do_with_cleanups
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\cmds.py", line 314, in run
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\cmds.py", line 40, in _run
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\processors\generic_processor.py", line 311, in process
  File "fastimport\processor.pyo", line 76, in _process
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\processors\generic_processor.py", line 556, in commit_handler
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\processors\generic_processor.py", line 510, in checkpoint_handler
  File "C:/Program Files (x86)/Bazaar/plugins\fastimport\processors\generic_processor.py", line 429, in _pack_repository
  File "bzrlib\decorators.pyo", line 217, in write_locked
  File "bzrlib\repofmt\pack_repo.pyo", line 1805, in pack
  File "bzrlib\repofmt\pack_repo.pyo", line 997, in pack
  File "bzrlib\repofmt\pack_repo.pyo", line 1019, in _try_pack_operations
  File "bzrlib\repofmt\pack_repo.pyo", line 945, in _execute_pack_operations
  File "bzrlib\repofmt\pack_repo.pyo", line 714, in pack
  File "bzrlib\repofmt\groupcompress_repo.pyo", line 490, in _create_pack_from_packs
  File "bzrlib\repofmt\groupcompress_repo.pyo", line 460, in _copy_chk_texts
  File "bzrlib\groupcompress.pyo", line 1751, in _insert_record_stream
  File "bzrlib\repofmt\groupcompress_repo.pyo", line 299, in next_stream
  File "bzrlib\groupcompress.pyo", line 1474, in get_record_stream
  File "bzrlib\groupcompress.pyo", line 1580, in _get_remaining_record_stream
  File "bzrlib\groupcompress.pyo", line 2150, in get_build_details
  File "bzrlib\groupcompress.pyo", line 2094, in _get_entries
MemoryError

Attached is the bzr log (very, very long) covering the period from when I started the fast-import to when it crashed.

I tried to restart the fast-import, but then I ran into this bug: https://bugs.launchpad.net/bzr/+bug/541626 , and now I seem unable to complete it because it keeps crashing with that error.

Also attached is my dxdiag output, which someone on IRC told me to add to help diagnose the problem.

The exact procedure I followed: I used git to clone the repository at "https://github.com/torvalds/linux.git", then ran "git fast-export --all > linux.fe" to create the fast-export file. Then, using Bazaar Explorer, I ran "bzr fast-import C:/Users/Mark/Desktop/tmp_linux/linux.fe C:/Users/Mark/Desktop/tmp_linux_bzr_repo". It took about 15 hours to get to the point where it crashed, so be prepared to wait.

Tags: memory
Mark Grandi (markgrandi)
description: updated
Revision history for this message
Jelmer Vernooij (jelmer) wrote : Re: [Bug 864217] [NEW] MemoryError when repacking repository with large numbers of revisions

On Sat, Oct 01, 2011 at 06:11:22PM -0000, Mark Grandi wrote:
What version of bzr are you using?

Cheers,

Jelmer

Revision history for this message
Mark Grandi (markgrandi) wrote :

I am using Bazaar version 2.4.0, according to the log I posted above.

Revision history for this message
Martin Packman (gz) wrote :

The Bazaar version in the log is 2.4 from the Windows all-in-one installer, so the addressable-memory limit of a 32-bit process will be one issue here.

To make this bug useful, we really need to know *where* the memory is going so blame can be assigned between bzr/bzr-fastimport/python-fastimport/whatever else. Repacking in bzr is certainly one aspect. I guess lp:meliae is not included in the installer, but it could be added separately if you want to investigate further, Mark.

There are a whole bunch of interesting things in the log that raise questions.

* During the import the repository was repacked five times, at 40,000, 80,000, 100,000, 120,000 and 160,000 revisions, where it died. The first repack took 40 minutes, and they got progressively slower; the last successful one took over an hour. From the start it was reporting 264,207 revisions in total. Is all this repacking a good use of time?

* The import started out reporting over 1,000 revisions processed per minute, then the rate dropped gradually and stabilised at around 400/minute by about the 30,000-revision mark. However, at around 39,000 revisions, but before the first repack, the log starts spamming "checking remap as size shrunk by 24 to be 25731"-style messages, which persist from there on, and the rate drops until only 170/minute are reported near the end. This looks like a suspect heuristic in bzrlib.chk_map that may be doing extra work for values just above the current threshold.
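The suspected pathology can be sketched as follows. This is a toy illustration only, with made-up names and numbers (`Node`, `REMAP_THRESHOLD`, the check window), not the actual bzrlib.chk_map code: a node that re-runs an expensive "can we remap?" check on every shrink while its size sits just above the remap threshold, so the check fires every time and never succeeds.

```python
# Hypothetical sketch of a size-threshold remap heuristic degrading:
# when the node size hovers just above the threshold, every shrink
# triggers an expensive check that never pays off.

REMAP_THRESHOLD = 25600               # assumed max size for a remapped node
CHECK_WINDOW = REMAP_THRESHOLD * 1.1  # "close enough to bother checking"

class Node:
    def __init__(self, size):
        self.size = size
        self.remap_checks = 0         # how many expensive checks we ran

    def shrink(self, delta):
        """Remove delta bytes, then decide whether to try a remap."""
        self.size -= delta
        if self.size < CHECK_WINDOW:
            # Stand-in for the expensive part: walk children,
            # reserialize, measure the result.
            self.remap_checks += 1
            return self.size <= REMAP_THRESHOLD
        return False

node = Node(size=26000)
results = [node.shrink(1) for _ in range(100)]
# All 100 shrinks fall inside the check window (size drifts from 25999
# down to 25900, always above 25600), so we pay for 100 expensive
# checks and perform zero remaps.
```

If something like this is happening, a value sitting just above the threshold turns every small delete into wasted work, which would match the observed slowdown from 400/minute to 170/minute.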

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 864217] Re: MemoryError when repacking repository with large numbers of revisions

Well, bzr's 'natural' pack for that would be:
2*100K
6*10K
4*1K
2*100
7*1
=======
264,207

reached by
2 100K pack operations
26 10K pack operations
264 1K pack operations
2642 100 rev pack operations
26420 10 rev pack operations
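
The arithmetic above can be sketched in a few lines. These are hypothetical helpers for illustration (`pack_distribution`, `pack_operations`), not bzr's actual repacking code: one decomposes a revision count into the digit-based "natural" pack distribution, the other counts the cumulative pack operations per tier over a whole import.

```python
def pack_distribution(revision_count):
    """One pack tier per decimal digit of the count: 264,207 revisions
    -> 2 packs of 100K, 6 of 10K, 4 of 1K, 2 of 100, 0 of 10, 7 of 1."""
    packs = []
    place = 1
    while place * 10 <= revision_count:
        place *= 10
    remaining = revision_count
    while place >= 1:
        digit = remaining // place
        packs.append((place, digit))
        remaining -= digit * place
        place //= 10
    return packs

def pack_operations(revision_count, places=(100000, 10000, 1000, 100, 10)):
    """Cumulative pack operations per tier: a tier-N pack operation
    happens once every N revisions over the course of the import."""
    return {place: revision_count // place for place in places}

dist = pack_distribution(264207)
assert sum(place * count for place, count in dist) == 264207
ops = pack_operations(264207)
# ops matches the counts listed above: 2 at 100K, 26 at 10K,
# 264 at 1K, 2642 at 100, 26420 at 10.
```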

How this balances vs the 40K heuristic the import is using I'm not sure :)

Being packed -reasonably- tightly helps performance during the import,
but perhaps it could be tuned based on what's being imported?

-Rob

Martin Pool (mbp)
Changed in bzr:
status: New → Confirmed
importance: Undecided → Medium
tags: added: memory
removed: bzr memoryerror repacking
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
tags: removed: check-for-breezy
Changed in brz:
status: New → Triaged
importance: Undecided → Medium