Re: Memory allocation performance
- To: Julian Elischer <julian_(_at_)_elischer_(_dot_)_org>
- Subject: Re: Memory allocation performance
- From: Alexander Motin <mav_(_at_)_FreeBSD_(_dot_)_org>
- Date: Fri, 01 Feb 2008 08:56:34 +0200
- Cc: freebsd-hackers_(_at_)_freebsd_(_dot_)_org, freebsd-performance_(_at_)_freebsd_(_dot_)_org
Julian Elischer wrote:
Alexander Motin wrote:
While profiling netgraph operation on a UP HEAD router I have found that
a huge amount of time is spent on memory allocation/deallocation:
% time   self  children  called           name
         0.14  0.05      132119/545292        ip_forward <cycle 1>
         0.14  0.05      133127/545292        fxp_add_rfabuf
         0.27  0.10      266236/545292        ng_package_data
  14.1   0.56  0.21      545292           uma_zalloc_arg
         0.17  0.00      545292/1733401       critical_exit <cycle 2>
         0.01  0.00      275941/679675        generic_bzero
         0.01  0.00      133127/133127        mb_ctor_pack

         0.15  0.06      133100/545266        mb_free_ext
         0.15  0.06      133121/545266        m_freem
         0.29  0.11      266236/545266        ng_free_item
  15.2   0.60  0.23      545266           uma_zfree_arg
         0.17  0.00      545266/1733401       critical_exit <cycle 2>
         0.00  0.04      133100/133100        mb_dtor_pack
         0.00  0.00      134121/134121        mb_dtor_mbuf
I have already optimized every allocation call I could, and those that
are left are practically unavoidable. But even after this, kgmon reports
that 30% of CPU time is consumed by memory management.
So I have some questions:
1) Is this the real situation or just a profiler mistake?
2) If it is real, then why is UMA so slow? I have tried replacing it
in some places with a preallocated TAILQ of the required memory blocks,
protected by a mutex, and according to the profiler I got _much_ better
results. Would it be good practice to replace relatively small UMA
zones with a preallocated queue to avoid some of the UMA calls?
3) I have seen that UMA does some kind of CPU cache affinity, but is
it really so expensive that it takes 30% of CPU time on a UP router?
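For readers unfamiliar with the approach in question 2, here is a minimal userland sketch of a preallocated TAILQ of fixed-size blocks protected by a mutex. It is not the code used in the experiment above: the names `item_pool`, `pool_item`, `pool_init`, `pool_get` and `pool_put` are hypothetical, the 64-byte payload is a placeholder, and a pthread mutex stands in for a kernel mutex. Only the `TAILQ_*` macros from `<sys/queue.h>` and the default of 512 items come from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

#define POOL_ITEMS 512                /* default item count mentioned in this thread */

/* Hypothetical item; the real netgraph item layout is not shown here. */
struct pool_item {
    TAILQ_ENTRY(pool_item) link;      /* free-list linkage */
    char payload[64];                 /* placeholder size (assumption) */
};

TAILQ_HEAD(pool_head, pool_item);

struct item_pool {
    struct pool_head free_list;       /* items currently available */
    pthread_mutex_t  lock;            /* stands in for a kernel mutex */
    unsigned         free_count;      /* number of items on free_list */
    struct pool_item storage[POOL_ITEMS]; /* all memory reserved up front */
};

/* Put every preallocated item on the free list once, e.g. at boot. */
void pool_init(struct item_pool *p)
{
    TAILQ_INIT(&p->free_list);
    pthread_mutex_init(&p->lock, NULL);
    for (size_t i = 0; i < POOL_ITEMS; i++)
        TAILQ_INSERT_TAIL(&p->free_list, &p->storage[i], link);
    p->free_count = POOL_ITEMS;
}

/* Take one item, or return NULL if the pool is exhausted. */
struct pool_item *pool_get(struct item_pool *p)
{
    struct pool_item *it;

    pthread_mutex_lock(&p->lock);
    it = TAILQ_FIRST(&p->free_list);
    if (it != NULL) {
        TAILQ_REMOVE(&p->free_list, it, link);
        p->free_count--;
    }
    pthread_mutex_unlock(&p->lock);
    return it;
}

/* Return an item; inserting at the head reuses the most recently freed block first. */
void pool_put(struct item_pool *p, struct pool_item *it)
{
    assert(it != NULL);
    pthread_mutex_lock(&p->lock);
    TAILQ_INSERT_HEAD(&p->free_list, it, link);
    p->free_count++;
    pthread_mutex_unlock(&p->lock);
}
```

A caller would run `pool_init()` once, then pair each `pool_get()` with a `pool_put()`. The caller has to handle a NULL return when all items are in use, since this sketch never grows the pool.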
Given this information, I would add an 'item cache' in ng_base.c
(hmm, do I already have one?)
That was actually my second question. As there are only 512 items by
default and they are small, I can easily preallocate them all at
boot. But is that a good approach? Why can't UMA do just the same when I
have created a zone with a specified element size and maximum number of
objects? What is the fundamental difference?