mjd-perl-pm on Tue, 9 Apr 2002 01:46:33 -0400



How big is an array?


------- Forwarded Message
Subject: Data sizes
Date: Tue, 09 Apr 2002 01:37:46 -0400
From: Mark-Jason Dominus <mjd@plover.com>

I peered into the .h files again today and figured out the sizes of
the structures.  I wasn't too careful about looking everything up, so
some may be a little off, but probably not by more than a few bytes.

The short summary is:

        Typical scalar: 16-32 bytes plus data size
Object based on scalar: 40 + 16 per reference
                 Array: 49 +           4 bytes per element + size of elements
                  Hash: 56 + 4 per bucket + 21 per element + size of elements

So for example the string "BLARF" probably takes about 30 bytes.  (24
overhead plus 6 for "BLARF\0")

Typically the number of hash buckets will be approximately equal to
the number of elements, so a rough estimate for hashes is 56 + 25 per
element.
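
The summary figures can be turned into a small calculator. This is only
a sketch of the arithmetic, written in Python for convenience; the
function names are made up for illustration, the constants are the
approximate figures quoted above, and the default of one bucket per
element is the rough estimate from the previous paragraph rather than
anything Perl guarantees.

```python
# Approximate per-structure costs in bytes, copied from the summary above.
# They are estimates, not exact values for any particular perl build.
STRING_OVERHEAD = 24     # plain string scalar (pv)
ARRAY_OVERHEAD = 49
ARRAY_PER_ELEMENT = 4
HASH_OVERHEAD = 56
HASH_PER_BUCKET = 4
HASH_PER_ELEMENT = 21


def string_bytes(s):
    """Plain string scalar: overhead, the string bytes, and a trailing NUL."""
    return STRING_OVERHEAD + len(s.encode("utf-8")) + 1


def array_bytes(element_sizes):
    """Array whose elements have the given sizes in bytes."""
    n = len(element_sizes)
    return ARRAY_OVERHEAD + ARRAY_PER_ELEMENT * n + sum(element_sizes)


def hash_bytes(element_sizes, buckets=None):
    """Hash whose values have the given sizes in bytes.

    If buckets is not given, assume one bucket per element (the rough
    estimate used in the text).
    """
    n = len(element_sizes)
    if buckets is None:
        buckets = n
    return (HASH_OVERHEAD + HASH_PER_BUCKET * buckets
            + HASH_PER_ELEMENT * n + sum(element_sizes))


blarf = string_bytes("BLARF")
print(blarf)                     # 24 overhead + 6 for "BLARF\0"
print(array_bytes([blarf] * 5))  # array of five such strings
print(hash_bytes([blarf] * 10))  # hash holding ten such strings
```

With the default bucket count, hash_bytes reduces to 56 + 25 per
element plus the element sizes, which is the rough estimate given above.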

Here's a more complete listing:

base sv: 12
  rv (reference):                 16
  pv (string):                    24 + string data
  pviv (integer):                 28 + string data
  pvuv:                           28 + string data
  pvnv (floating-point):          32 + string data
  pvmg (magical or blessed):      40 + magic structures
    typical blessed scalar:       40
  pvlv (lvalue):                  56 + magic structures
  pvgv (glob):                    60 + magic structures + name length
                                     + gp (= 48)
  pvbm (Boyer-Moore-ized string): 47 + magic structures
  pvfm (format):                  88 + magic structures + code size
  pvio (file/dirhandle):          96 + magic structures + PerlIO (+ PerlIO)

array:
  pvav:                           49 + 4 per element + elements

hash:
  pvhv:                           56 + 4 per bucket + 21 per element + elements
                                  (at least 8 buckets)

magic structures:                 52 each plus magic data

(Most scalars won't have any magic structures.  The ones that
represent magic variables or tied or overloaded values will typically
have one magic structure each.  Ordinary blessed scalars have no magic
structures.)

subroutine:
  pvcv (subroutine):              84 + magic structures + pad (AV) + op tree

ops (each):
  op (null, pushmark, padsv):     24
  unop (rv2sv, refgen, preinc):   28
  binop (add, lt, bit_or, aelem): 32
  logop (regcomp, and, mapwhile): 32
  listop (substr, aslice, index): 32
  pmop (subst, match, qr):        57 + regex
  padop (const, aelemfast,
        anoncode, method_named):  28
  pvop (trans):                   28
  loop (enteriter, enterloop):    44

regex: 4 + 4 per regex node + literal string data + 32 per charclass

CAVEATS: Values range from 'approximate' to 'highly approximate' to
'merely inaccurate'.  Values depend on all sorts of stuff, such as:
        What size is an int?
        Did you compile with 64-bit support?
        Did you compile with USE5005THREADS?
        What exactly is in the structures?
        Etc.

In particular, note that if your program has 1,000,000 arrays of five
items each, you can expect to save about 49 megabytes (the 49-byte
overhead on each of the 1,000,000 arrays) if you can transpose the
data into five arrays of 1,000,000 items each.
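
The arithmetic behind that figure can be checked with a few lines. This
is a sketch in Python, used purely as a calculator; the function name is
hypothetical, and the 49-byte and 4-byte constants are the approximate
array figures from the listing. The element scalars themselves are left
out because they cost the same in both layouts.

```python
ARRAY_OVERHEAD = 49      # approximate fixed cost per array, in bytes
ARRAY_PER_ELEMENT = 4    # approximate cost per element slot, in bytes


def arrays_overhead(n_arrays, items_each):
    """Bytes spent on the arrays themselves, excluding the element scalars."""
    return n_arrays * (ARRAY_OVERHEAD + ARRAY_PER_ELEMENT * items_each)


many_small = arrays_overhead(1_000_000, 5)   # 1,000,000 arrays of 5 items
few_large = arrays_overhead(5, 1_000_000)    # 5 arrays of 1,000,000 items
saved = many_small - few_large

print(many_small)   # 69000000
print(few_large)    # 20000245
print(saved)        # 48999755, i.e. about 49 megabytes
```

The per-element slots total 20,000,000 bytes either way, so the whole
saving comes from paying the fixed array overhead 5 times instead of
1,000,000 times.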

------- End of Forwarded Message

Hope you find this enlightening or at least interesting.
