[PC-BSD Testing] upgrading.. slower than needs be
kris at pcbsd.org
Tue Dec 6 07:36:51 PST 2011
On 12/06/2011 01:42, Julian Elischer wrote:
> On 12/5/11 10:18 AM, Julian Elischer wrote:
>> On 12/5/11 8:02 AM, Kris Moore wrote:
>>> On 12/01/2011 21:06, Julian Elischer wrote:
>>>> On 12/1/11 5:03 PM, Julian Elischer wrote:
>>>>> Watching the RC1->RC2 upgrade on my nice shiny new 15Mb/sec internet
>>>>> link, I now see that teh upgrade process does one file at a time.
>>>>> it would be a lot faster to overlap the downloads if it were
>>>>> possible. Does the upgrade process know of more than one file that it
>>>>> needs to get?
>>>> One could at least make the fetch of the checksum file be done in
>>>> parallel, so that the tcp handshake is done during the transfer of the
>>>> main file.
>>>> Looking at my system upgrading however it looks as though there is
>>>> something else that should be found too.
>>>> It's 95 % idle, yet it is only asking for each file, one at a time
>>>> with large gaps between them.(10 seconds)?
>>>> Disk I doesn't seem to be a bottleneck either so I can't work out
>>>> what the holdup is.
>>> I'll see what we can do to speed this up :) Parallel downloads would
>>> help for sure, if anything we can do the md5 at the same time.
>> unfortunately I have now upgraded so I can't do it again to test, but
>> I could se no reason as to why it was so slow.
>> The disk was barely being touched, and there was only 5% cpu usage,
>> and teh network IO was only sporadic.. it would do aburst of high
>> bandwidth where it downloaded the file, then there would be a blip
>> where it did someting on cpu then a 5 second gap of nothing then the
>> md 5 would be received and then another 5 seconds of nothing much,
>> and then the cycle would repeat.
>> I just realised that I took a ktrace of the upgrade and that shoudl
>> still be there. heh.. maybe that has some clues..
> yeah like:
> there appears to be a 2 second sleep in there?
> 87555 sleep 1322793431.245036 RET sigprocmask 0
> 87555 sleep 1322793431.245155 CALL
> 87555 sleep 1322793431.245178 RET sigaction 0
> 87555 sleep 1322793431.245200 CALL
> 46438 sh 1322793432.347563 RET wait4 87543/0x155f7
> 46438 sh 1322793432.347623 CALL
> 87555 sleep 1322793433.247147 RET nanosleep 0
> 87555 sleep 1322793433.247467 CALL
> 87555 sleep 1322793433.247493 RET sigprocmask 0
> and it's calling 'ps' for gods sake;
> 87554 sh 1322793431.229882 RET close 0
> 87554 sh 1322793431.229953 CALL
> 87554 sh 1322793431.230051 NAMI "/bin/ps"
> 87554 sh 1322793431.230233 NAMI "/libexec/ld-elf.so.1"
> 87554 ps 1322793431.230350 RET execve 0
> and it sleeps in parallel with fetching??
> 87632 fetch 1322793440.826567 RET access -1 errno 2 No such file
> or directory
> 87633 sleep 1322793440.826575 RET pread 4096/0x1000
> and between bursts of activity it seems to just sit around calling tr,
> cut and du amongst others
> 87615 tr 1322793440.459914 NAMI "/lib/libc.so.7"
> 87614 du 1322793440.459939 CALL pread(0x3,0x80083c920,0x1000,0)
> 87616 cut 1322793440.459806 RET mmap 34366263296/0x800639000
> and sleep again:
> 46438 sh 1322793438.855010 CALL
> 87606 sleep 1322793440.453143 RET nanosleep 0
> oh, and ps a few more times..
> which seems to just be returning:
> 87604 ps 1322793438.447099 GIO fd 1 wrote 112 bytes
> " PID TT STAT TIME COMMAND
> 87603 ?? S 0:00.00 /bin/sh
> /usr/local/bin/pc-updatemanager install release
> 87604 ps 1322793438.447147 RET write 112/0x70
> 87604 ps 1322793438.447197 CALL exit(0)
> and fetch seems to read stuff in 1 char at a time, with gettimeofday()
> for each read..
> 87605 fetch 1322793438.532712 GIO fd 3 read 1 byte
> 87605 fetch 1322793438.532733 RET read 1
> 87605 fetch 1322793438.532754 CALL gettimeofday(0x7fffffffcd40,0)
> 87605 fetch 1322793438.532774 RET gettimeofday 0
> 87605 fetch 1322793438.532795 CALL read(0x3,0x7fffffffcdaf,0x1)
> 87605 fetch 1322793438.532818 GIO fd 3 read 1 byte
> " "
> 87605 fetch 1322793438.532839 RET read 1
> 87605 fetch 1322793438.532860 CALL gettimeofday(0x7fffffffcd40,0)
> well you get the idea...
> it looks like whatever shell script is doing the upgrade could do
> with a bit of love.
The "sleep" while fetching is normal. We run fetch as a background task,
and run in a tight loop to echo / display the progress in a manner which
the GUI can parse. (Fetch output isn't so nice for this) So those
sleep/ps/fetch calls won't be causing any slowdown to the download.
If you grab the latest from SVN, it will run a lot faster. We had some
other sleeps in there which have been removed, now it'll just run right
through the downloads, only stopping when a fetch error occurs.
More information about the Testing