Re: Memory allocation performance
- To: Alexander Motin <mav_(_at_)_FreeBSD_(_dot_)_org>
- Subject: Re: Memory allocation performance
- From: Julian Elischer <julian_(_at_)_elischer_(_dot_)_org>
- Date: Thu, 31 Jan 2008 23:07:48 -0800
- Cc: freebsd-hackers_(_at_)_freebsd_(_dot_)_org, freebsd-performance_(_at_)_freebsd_(_dot_)_org
Alexander Motin wrote:
Julian Elischer wrote:
Alexander Motin wrote:
While profiling netgraph operation on a UP HEAD router I have found
that a huge amount of time is spent on memory allocation/deallocation:
% time   self  children   called              name
         0.14    0.05     132119/545292           ip_forward <cycle 1>
         0.14    0.05     133127/545292           fxp_add_rfabuf
         0.27    0.10     266236/545292           ng_package_data
  14.1   0.56    0.21     545292              uma_zalloc_arg
         0.17    0.00     545292/1733401          critical_exit <cycle 2>
         0.01    0.00     275941/679675           generic_bzero
         0.01    0.00     133127/133127           mb_ctor_pack

         0.15    0.06     133100/545266           mb_free_ext
         0.15    0.06     133121/545266           m_freem
         0.29    0.11     266236/545266           ng_free_item
  15.2   0.60    0.23     545266              uma_zfree_arg
         0.17    0.00     545266/1733401          critical_exit <cycle 2>
         0.00    0.04     133100/133100           mb_dtor_pack
         0.00    0.00     134121/134121           mb_dtor_mbuf
I have already optimized all possible allocation calls, and those that
are left are practically unavoidable. But even after this, kgmon reports
that 30% of CPU time is consumed by memory management.
So I have some questions:
1) Is this the real situation or just a profiler mistake?
2) If it is real, then why is UMA so slow? I have tried replacing it
in some places with a preallocated TAILQ of the required memory blocks
protected by a mutex, and according to the profiler I got _much_ better
results. Would it be good practice to replace relatively small UMA
zones with a preallocated queue to avoid some of the UMA calls?
3) I have seen that UMA does some kind of CPU cache affinity, but
does it cost so much that it takes 30% of CPU time on a UP router?
Given this information, I would add an 'item cache' in ng_base.c
(hmm, do I already have one?)
That was actually my second question. As there are only 512 items by
default and they are small in size, I can easily preallocate them all at
boot. But is that a good way? Why can't UMA do just the same when I have
created a zone with a specified element size and maximum number of objects?
What is the principal difference?
Who knows what UMA does... but if you do it yourself, you know what the
overhead is. :-)
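
For readers following the thread, the "preallocated TAILQ protected by a mutex" idea discussed above might look roughly like the userspace sketch below. It is not the code from ng_base.c: the names item_pool, pool_item, pool_get and pool_put and the 64-byte payload are hypothetical, and a pthread mutex stands in for the kernel mutex that an in-kernel version would presumably use. Only the item count of 512 comes from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

#define POOL_ITEMS 512  /* default item count mentioned in the thread */

/* Hypothetical item type; the real netgraph item layout is not shown here. */
struct pool_item {
    TAILQ_ENTRY(pool_item) link;  /* free-list linkage */
    char payload[64];             /* assumed small fixed-size body */
};

TAILQ_HEAD(pool_head, pool_item);

struct item_pool {
    struct pool_head free_list;            /* items currently available */
    pthread_mutex_t lock;                  /* protects free_list and nfree */
    unsigned nfree;                        /* number of items on free_list */
    struct pool_item storage[POOL_ITEMS];  /* all items, allocated up front */
};

/* Put every preallocated item on the free list once, at startup. */
static void pool_init(struct item_pool *p)
{
    TAILQ_INIT(&p->free_list);
    pthread_mutex_init(&p->lock, NULL);
    p->nfree = 0;
    for (size_t i = 0; i < POOL_ITEMS; i++) {
        TAILQ_INSERT_TAIL(&p->free_list, &p->storage[i], link);
        p->nfree++;
    }
}

/* Take one item from the pool; returns NULL when the pool is exhausted. */
static struct pool_item *pool_get(struct item_pool *p)
{
    pthread_mutex_lock(&p->lock);
    struct pool_item *it = TAILQ_FIRST(&p->free_list);
    if (it != NULL) {
        TAILQ_REMOVE(&p->free_list, it, link);
        p->nfree--;
    }
    pthread_mutex_unlock(&p->lock);
    return it;
}

/* Return an item to the head of the list so it is reused first. */
static void pool_put(struct item_pool *p, struct pool_item *it)
{
    pthread_mutex_lock(&p->lock);
    TAILQ_INSERT_HEAD(&p->free_list, it, link);
    p->nfree++;
    pthread_mutex_unlock(&p->lock);
}
```

The per-call cost here is one lock, one list operation and one unlock, which is the "you know what the overhead is" point. The trade-off is that the pool never grows or shrinks, so callers have to handle a NULL return when all 512 items are in use.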