Sizeof for Java


Object sizing revisited
By Vladimir Roubtsov

JavaWorld | Dec 26, 2003 12:00 AM PT

December 26, 2003

Q: Does Java have an operator like sizeof() in C?

A: A superficial answer is that Java does not provide anything like C's sizeof(). However, let's
consider why a Java programmer might occasionally want it.

A C programmer manages most datastructure memory allocations himself, and sizeof() is
indispensable for knowing memory block sizes to allocate. Additionally, C memory allocators
like malloc() do almost nothing as far as object initialization is concerned: a programmer must
set all object fields that are pointers to further objects. But when all is said and coded, C/C++
memory allocation is quite efficient.

By comparison, Java object allocation and construction are tied together (it is impossible to use
an allocated but uninitialized object instance). If a Java class defines fields that are references to
further objects, it is also common to set them at construction time. Allocating a Java object
therefore frequently allocates numerous interconnected object instances: an object graph.
Coupled with automatic garbage collection, this is all too convenient and can make you feel like
you never have to worry about Java memory allocation details.

Of course, this works only for simple Java applications. Compared with C/C++, equivalent Java
datastructures tend to occupy more physical memory. In enterprise software development,
getting close to the maximum available virtual memory on today's 32-bit JVMs is a common
scalability constraint. Thus, a Java programmer could benefit from sizeof() or something
similar to keep an eye on whether his datastructures are getting too large or contain memory
bottlenecks. Fortunately, Java reflection allows you to write such a tool quite easily.

Before proceeding, I will dispense with some frequent but incorrect answers to this article's
question.

Fallacy: Sizeof() is not needed because Java basic types' sizes are fixed

Yes, a Java int is 32 bits in all JVMs and on all platforms, but this is only a language
specification requirement for the programmer-perceivable width of this data type. Such an int is
essentially an abstract data type and can be backed up by, say, a 64-bit physical memory word on
a 64-bit machine. The same goes for nonprimitive types: the Java language specification says
nothing about how class fields should be aligned in physical memory or that an array of booleans
couldn't be implemented as a compact bitvector inside the JVM.

Fallacy: You can measure an object's size by serializing it into a byte stream and looking at the resulting stream length

This does not work because the serialization layout is only a remote reflection of
the true in-memory layout. One easy way to see it is by looking at how Strings get serialized: in
memory every char is at least 2 bytes, but in serialized form Strings are UTF-8 encoded and so
any ASCII content takes half as much space.
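
The mismatch is easy to observe. The sketch below is only an illustration: the class name SerializedLengthDemo and its helper method are hypothetical, and the 2-bytes-per-char figure is the in-memory assumption stated above rather than something the code measures. It serializes a String with the standard ObjectOutputStream and reports the stream length next to the assumed in-memory character payload.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical demo class; not part of the article's download.
final class SerializedLengthDemo {

    /** Returns the number of bytes ObjectOutputStream emits for the given string. */
    static int serializedLength(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        // Build a 1000-character ASCII string.
        char[] chars = new char[1000];
        java.util.Arrays.fill(chars, 'a');
        String s = new String(chars);

        int streamLength = serializedLength(s);
        // Assumption from the text: each char occupies 2 bytes in memory.
        int assumedCharPayload = 2 * s.length();

        System.out.println("serialized stream length: " + streamLength);
        System.out.println("assumed in-memory chars:  " + assumedCharPayload);
    }
}
```

For ASCII content the stream comes out far shorter than the assumed character payload, so the stream length says little about the heap footprint.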

Another working approach

You might recollect "Java Tip 130: Do You Know Your Data Size?" that described a technique
based on creating a large number of identical class instances and carefully measuring the
resulting increase in the JVM used heap size. When applicable, this idea works very well, and I
will in fact use it to bootstrap the alternate approach in this article.

Note that Java Tip 130's Sizeof class requires a quiescent JVM (so that the heap activity is only
due to object allocations and garbage collections requested by the measuring thread) and requires
a large number of identical object instances. This does not work when you want to size a single
large object (perhaps as part of a debug trace output) and especially when you want to examine
what actually made it so large.
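
For readers who have not seen Java Tip 130, the heap-difference idea can be sketched roughly as follows. This is not the Sizeof class from that tip: the class name HeapDeltaSizer and its methods are hypothetical, System.gc() is only a hint to the JVM, and the result is a noisy estimate that depends on the JVM, its settings, and whatever else is running.

```java
// Hypothetical sketch of the heap-difference technique; not Java Tip 130's code.
final class HeapDeltaSizer {

    private static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Ask for a few collections so the heap is as quiet as we can make it.
    private static void settle() {
        for (int i = 0; i < 4; i++) {
            System.gc();
        }
    }

    /** Estimates bytes per java.lang.Object instance from the growth of the used heap. */
    static double estimateObjectSize(int count) {
        Object[] keep = new Object[count]; // allocated before measuring, so not counted
        settle();
        long before = usedMemory();
        for (int i = 0; i < count; i++) {
            keep[i] = new Object();
        }
        settle();
        long after = usedMemory();
        double estimate = (double) (after - before) / count;
        // Touch the array after measuring so the instances stay reachable until here.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("unexpected null element");
        }
        return estimate;
    }

    public static void main(String[] args) {
        System.out.println("estimated bytes per Object: " + estimateObjectSize(1_000_000));
    }
}
```

The need for a large instance count and a quiet heap is visible directly in the code, which is why this approach cannot size one particular object or explain what is inside it.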

What is an object's size?

The discussion above highlights a philosophical point: given that you usually deal with object
graphs, what is the definition of an object size? Is it just the size of the object instance you're
examining or the size of the entire data graph rooted at the object instance? The latter is what
usually matters more in practice. As you shall see, things are not always so clear-cut, but for
starters you can follow this approach:

- An object instance can be (approximately) sized by totaling all of its nonstatic data fields
  (including fields defined in superclasses)
- Unlike, say, C++, class methods and their virtuality have no impact on the object size
- Class superinterfaces have no impact on the object size (see the note at the end of this
  list)
- The full object size can be obtained as a closure over the entire object graph rooted at the
  starting object

Note: Implementing any Java interface merely marks the class in question and does not add any
data to its definition. In fact, the JVM does not even validate that an interface implementation
provides all methods required by the interface: this is strictly the compiler's responsibility in the
current specifications.

To bootstrap the process, for primitive data types I use physical sizes as measured by Java Tip
130's Sizeof class. As it turns out, for common 32-bit JVMs a plain java.lang.Object takes
up 8 bytes, and the basic data types are usually of the least physical size that can accommodate
the language requirements (except boolean takes up a whole byte):

// java.lang.Object shell size in bytes:
public static final int OBJECT_SHELL_SIZE = 8;
public static final int OBJREF_SIZE = 4;
public static final int LONG_FIELD_SIZE = 8;
public static final int INT_FIELD_SIZE = 4;
public static final int SHORT_FIELD_SIZE = 2;
public static final int CHAR_FIELD_SIZE = 2;
public static final int BYTE_FIELD_SIZE = 1;
public static final int BOOLEAN_FIELD_SIZE = 1;
public static final int DOUBLE_FIELD_SIZE = 8;
public static final int FLOAT_FIELD_SIZE = 4;

(It is important to realize that these constants are not hardcoded forever and must be
independently measured for a given JVM.) Of course, naive totaling of object field sizes neglects
memory alignment issues in the JVM. Memory alignment does matter (as shown, for example,
for primitive array types in Java Tip 130), but I think it is unprofitable to chase after such low-
level details. Not only are such details dependent on the JVM vendor, they are not under the
programmer's control. Our objective is to obtain a good guess of the object's size and hopefully
get a clue when a class field might be redundant; or when a field should be lazily populated; or
when a more compact nested datastructure is necessary, etc. For absolute physical precision you
can always go back to the Sizeof class in Java Tip 130.
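
As a rough illustration of the field-totaling rule, the sketch below walks a class and its superclasses with reflection and adds up the nonstatic fields using the constants listed above. The names ShallowSizer, fieldSize, shallowSize, Base, and Derived are hypothetical, the constants are the 32-bit measurements quoted in the text rather than universal values, and alignment is ignored as discussed. It is meant for ordinary classes, not array classes.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical sketch; sizes are the article's 32-bit JVM measurements.
final class ShallowSizer {
    static final int OBJECT_SHELL_SIZE = 8;
    static final int OBJREF_SIZE = 4;

    /** Size in bytes assumed for a field of the given declared type. */
    static int fieldSize(Class<?> type) {
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        if (type == byte.class || type == boolean.class) return 1;
        return OBJREF_SIZE; // any reference type, including arrays
    }

    /** Totals the nonstatic fields of cls and all of its superclasses. */
    static int shallowSize(Class<?> cls) {
        int size = OBJECT_SHELL_SIZE;
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    size += fieldSize(f.getType());
                }
            }
        }
        return size;
    }

    // Sample classes used only to exercise the sketch.
    static class Base { int id; long stamp; Object ref; static int counter; }
    static class Derived extends Base { boolean flag; char code; }

    public static void main(String[] args) {
        System.out.println("Base:    " + shallowSize(Base.class));
        System.out.println("Derived: " + shallowSize(Derived.class));
    }
}
```

With the constants above, Base totals 8 + 4 + 8 + 4 = 24 bytes and Derived adds 1 + 2 for 27; the static counter field and any implemented interfaces contribute nothing.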

To help profile what makes up an object instance, our tool will not just compute the size but will
also build a helpful datastructure as a byproduct: a graph made up of IObjectProfileNodes:

interface IObjectProfileNode
{
    Object object ();
    String name ();

    int size ();
    int refcount ();

    IObjectProfileNode parent ();
    IObjectProfileNode [] children ();
    IObjectProfileNode shell ();

    IObjectProfileNode [] path ();
    IObjectProfileNode root ();
    int pathlength ();

    boolean traverse (INodeFilter filter, INodeVisitor visitor);
    String dump ();

} // End of interface

IObjectProfileNodes are interconnected in almost exactly the same way as the original object
graph, with IObjectProfileNode.object() returning the real object each node represents.
IObjectProfileNode.size() returns the total size (in bytes) of the object subtree rooted at that
node's object instance. If an object instance links to other objects via non-null instance fields or
via references contained inside array fields, then IObjectProfileNode.children() will be a
corresponding list of child graph nodes, sorted in decreasing size order. Conversely, for every
node other than the starting one, IObjectProfileNode.parent() returns its parent. The entire
collection of IObjectProfileNodes thus slices and dices the original object and shows how data
storage is partitioned within it. Furthermore, the graph node names are derived from the class
fields and examining a node's path within the graph (IObjectProfileNode.path()) allows you
to trace the ownership links from the original object instance to any internal piece of data.
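
To make the relationship between size(), children(), and path() concrete, here is a deliberately simplified stand-in for such a node. It is not the article's implementation of IObjectProfileNode: the class ProfileNode, its ownSize field, and the use of lists instead of arrays are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical, simplified stand-in for a profile node.
final class ProfileNode {
    final String name;
    final int ownSize;          // bytes attributed to this object alone
    final ProfileNode parent;   // null for the starting node
    private final List<ProfileNode> children = new ArrayList<>();

    ProfileNode(String name, int ownSize, ProfileNode parent) {
        this.name = name;
        this.ownSize = ownSize;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    /** Total size of the subtree rooted at this node. */
    int size() {
        int total = ownSize;
        for (ProfileNode child : children) {
            total += child.size();
        }
        return total;
    }

    /** Children sorted in decreasing subtree-size order. */
    List<ProfileNode> children() {
        List<ProfileNode> sorted = new ArrayList<>(children);
        sorted.sort((a, b) -> Integer.compare(b.size(), a.size()));
        return sorted;
    }

    /** Ownership chain from the starting node down to this node. */
    List<ProfileNode> path() {
        LinkedList<ProfileNode> path = new LinkedList<>();
        for (ProfileNode n = this; n != null; n = n.parent) {
            path.addFirst(n);
        }
        return path;
    }
}
```

Because each node has exactly one parent, every byte is counted once and the subtree sizes add up to the size reported at the starting node.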

You might have noticed while reading the previous paragraph that the idea so far still has some
ambiguity. If, while traversing the object graph, you encounter the same object instance more
than once (i.e., more than one field somewhere in the graph is pointing to it), how do you assign
its ownership (the parent pointer)? Consider this code snippet:

Object obj = new String [] {new String ("JavaWorld"),
                            new String ("JavaWorld")};

Each java.lang.String instance has an internal field of type char[] that is the actual string
content. The way the String copy constructor works in Java 2 Platform, Standard Edition
(J2SE) 1.4, both String instances inside the above array will share the same char[] array
containing the {'J', 'a', 'v', 'a', 'W', 'o', 'r', 'l', 'd'} character sequence. Both
strings own this array equally, so what should you do in cases like this?

If I always want to assign a single parent to a graph node, then this problem has no universally
perfect answer. However, in practice, many such object instances could be traced back to a single
"natural" parent. Such a natural sequence of links is usually shorter than the other, more
circuitous routes. Think about data pointed to by instance fields as belonging more to that
instance than to anything else. Think about entries in an array as belonging more to that array
itself. Thus, if an internal object instance can be reached via several paths, we choose the shortest
path. If we have several paths of equal lengths, well, we just pick the first discovered one. In the
worst case, this is as good a generic strategy as any.

Thinking about graph traversals and shortest paths should ring a bell at this point: breadth-first
search is a graph traversal algorithm that guarantees to find the shortest path from the starting
node to any other reachable graph node.
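
As a rough illustration of the idea only, and not the implementation that this article presents, the sketch below runs a breadth-first search over an object graph and records, for each object, the object it was first discovered from. The names BfsOwnership, assignParents, and Holder are hypothetical. To stay runnable on recent JVMs, which restrict reflective access to JDK internals, the sketch assumes that instances of java.* classes are leaves and does not look inside them.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of shortest-path ownership assignment via BFS.
final class BfsOwnership {

    /** Maps each reachable object to its first-discovered parent (the root maps to null). */
    static Map<Object, Object> assignParents(Object root) throws IllegalAccessException {
        Map<Object, Object> parentOf = new IdentityHashMap<>(); // identity, not equals()
        Queue<Object> queue = new ArrayDeque<>();
        parentOf.put(root, null);
        queue.add(root);
        while (!queue.isEmpty()) {
            Object obj = queue.remove();
            Class<?> cls = obj.getClass();
            if (cls.isArray()) {
                if (cls.getComponentType().isPrimitive()) continue; // no references inside
                for (Object element : (Object[]) obj) {
                    visit(element, obj, parentOf, queue);
                }
                continue;
            }
            // Assumption: stop at java.* classes instead of opening JDK internals.
            for (Class<?> c = cls; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    f.setAccessible(true);
                    visit(f.get(obj), obj, parentOf, queue);
                }
            }
        }
        return parentOf;
    }

    private static void visit(Object child, Object parent,
                              Map<Object, Object> parentOf, Queue<Object> queue) {
        if (child == null || parentOf.containsKey(child)) return; // already owned
        parentOf.put(child, parent);
        queue.add(child);
    }

    // Sample class: two Holders may share one char[] instance.
    static class Holder {
        Object data;
        Holder(Object data) { this.data = data; }
    }
}
```

If two Holder instances in an array share one char[], the array is assigned to whichever Holder is dequeued first; if the same char[] is also referenced directly from the root array, the shorter route wins and the root array becomes its parent.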

After all these preliminaries, here is a textbook implementation of such a graph traversal. (Some
details and auxiliary methods have been omitted; see this article's download for full details.):
