
Oak

the architecture of Apache Jackrabbit 3


Resources
• http://jackrabbit.apache.org/oak/
• Docs
• http://jackrabbit.apache.org/oak/docs/
• Code
• https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/
• https://github.com/apache/jackrabbit-oak
• Builds
• http://ci.apache.org/builders/oak-trunk/
• https://travis-ci.org/apache/jackrabbit-oak
Outline
• Tree model
• Updating the tree
• Refresh and garbage collection
• Concurrency and conflicts
• Interlude: Implementations
• Replicas and sharding
• Access control
• Comparing revisions
• Commit hooks
• Observers
• Search
• Big picture
Tree model
Paths as identifiers
• /
• /a
• /a/b
• /a/c
• /d
[Figure: tree with root /, whose children are a and d; a has children b and c]
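A minimal sketch of how a path can identify a node in such a tree. The class `PathTree` and its methods are hypothetical names made up for this illustration and are not part of the Oak API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical immutable tree node, for illustration only. */
final class PathTree {
    final Map<String, PathTree> children;

    PathTree(Map<String, PathTree> children) {
        this.children = Collections.unmodifiableMap(new LinkedHashMap<>(children));
    }

    static PathTree leaf() {
        return new PathTree(Collections.<String, PathTree>emptyMap());
    }

    /** Resolves an absolute path such as "/a/b"; returns null if no node is there. */
    PathTree resolve(String path) {
        PathTree node = this;
        for (String name : path.split("/")) {
            if (name.isEmpty()) {
                continue; // skips the empty segment before the leading "/"
            }
            node = node.children.get(name);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    /** Builds the example tree from the slide: /, /a, /a/b, /a/c, /d. */
    static PathTree example() {
        Map<String, PathTree> a = new LinkedHashMap<>();
        a.put("b", leaf());
        a.put("c", leaf());
        Map<String, PathTree> root = new LinkedHashMap<>();
        root.put("a", new PathTree(a));
        root.put("d", leaf());
        return new PathTree(root);
    }

    public static void main(String[] args) {
        PathTree root = example();
        System.out.println(root.resolve("/a/b") != null);
        System.out.println(root.resolve("/a/x") != null);
    }
}
```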
Updating the tree
[Figure: HEAD moves from revision r1 to revision r2; the panels compare /d and /a/c in r1 and r2]
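One way to read the r1/r2 picture is as a copy-on-write update: a change to /a/c produces a new root, while untouched subtrees such as /d are shared between revisions. The sketch below assumes that reading. `CowNode` is a hypothetical class for illustration, not Oak's node state implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable node, for illustration only. */
final class CowNode {
    final String value;
    private final Map<String, CowNode> children;

    CowNode(String value, Map<String, CowNode> children) {
        this.value = value;
        this.children = new HashMap<>(children); // private copy, never mutated later
    }

    CowNode child(String name) {
        return children.get(name);
    }

    /** Returns a new root with newValue at a non-root absolute path such as "/a/c". */
    CowNode with(String path, String newValue) {
        return with(path.substring(1).split("/"), 0, newValue);
    }

    private CowNode with(String[] names, int depth, String newValue) {
        if (depth == names.length) {
            return new CowNode(newValue, children); // same children, new value
        }
        CowNode old = children.get(names[depth]);
        if (old == null) {
            old = new CowNode(null, new HashMap<String, CowNode>());
        }
        Map<String, CowNode> copy = new HashMap<>(children);
        copy.put(names[depth], old.with(names, depth + 1, newValue));
        return new CowNode(value, copy); // only nodes along the path are copied
    }

    public static void main(String[] args) {
        CowNode r1 = new CowNode(null, new HashMap<String, CowNode>())
                .with("/a/c", "old").with("/d", "x");
        CowNode r2 = r1.with("/a/c", "new");
        System.out.println(r1.child("a").child("c").value); // r1 is unchanged
        System.out.println(r2.child("a").child("c").value);
        System.out.println(r1.child("d") == r2.child("d")); // /d is shared
    }
}
```

Because r1 is never modified, readers that still hold r1 keep a consistent view while r2 becomes the new HEAD.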
Refresh and garbage collection
[Figure labels: refresh, garbage]
Concurrency and conflicts
[Figure: revisions r2a and r2b both branch from r1 and are merged into r3]
Conflict handling strategies
a. Fully serialized commits
• fail on conflict, no concurrent updates
b. Partially serialized commits
• fail on conflict, concurrent conflict-free updates
c. Partial merge logic
• conflict markers, manual conflict resolution
d. Full merge logic
• conflicting changes may be lost
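Strategy (b) can be sketched as follows: the commit step itself is serialized, but two commits based on the same revision both succeed as long as they touch different paths. `SketchStore`, its flat path-to-value map, and the per-path revision tracking are assumptions made for this illustration, not how any Oak backend is implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store with "fail on conflict, concurrent conflict-free updates". */
final class SketchStore {
    private final Map<String, String> head = new HashMap<>();
    private final Map<String, Integer> lastChanged = new HashMap<>();
    private int revision = 0;

    synchronized int headRevision() {
        return revision;
    }

    synchronized String get(String path) {
        return head.get(path);
    }

    /**
     * Applies changes made on top of baseRevision and returns the new revision.
     * Throws IllegalStateException if any changed path was modified after baseRevision.
     */
    synchronized int commit(int baseRevision, Map<String, String> changes) {
        for (String path : changes.keySet()) {
            Integer changedAt = lastChanged.get(path);
            if (changedAt != null && changedAt > baseRevision) {
                throw new IllegalStateException("conflict on " + path);
            }
        }
        revision++;
        for (Map.Entry<String, String> e : changes.entrySet()) {
            head.put(e.getKey(), e.getValue());
            lastChanged.put(e.getKey(), revision);
        }
        return revision;
    }

    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        int base = store.headRevision();
        store.commit(base, Collections.singletonMap("/a", "1")); // succeeds
        store.commit(base, Collections.singletonMap("/d", "2")); // succeeds, different path
        try {
            store.commit(base, Collections.singletonMap("/a", "3"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // /a changed after the base revision
        }
    }
}
```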
Interlude: implementations
MicroKernel/NodeStore
• Implementation of the tree/revision model
Responsible for      Not responsible for
Clustering           Type validation
Sharding             Access control
Caching              Search
Conflict handling    Versioning
etc.                 etc.
Current implementations
                         DocumentMK                  TarMK (SegmentMK)
Persistence backends     MongoDB, JDBC (WIP)         Local FS (tar files)
Conflict handling        Partial serialization       Full serialization
Clustering               MongoDB clustering          Simple failover
Sharding                 MongoDB sharding            N/A
Single-node performance  Moderate                    High
Key use cases            Large deployments (>1TB),   Small/medium deployments,
                         concurrent writes           mostly read
Replicas and sharding
Replicas and caches
[Figure panels: master copy, full replica, cache]

Sharding strategies
[Figure panels: by path, by level, by hash with caching]
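A rough sketch of how a shard could be chosen under each strategy. The interpretation of each strategy (top-level subtree for "by path", tree depth for "by level", hash of the full path for "by hash") is an assumption, and `ShardingSketch` with all of its methods is hypothetical. Caching is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical shard selection functions, for illustration only. */
final class ShardingSketch {
    /** Depth of an absolute path: "/" is 0, "/a" is 1, "/a/b" is 2. */
    static int depth(String path) {
        return path.equals("/") ? 0 : path.split("/").length - 1;
    }

    /** By path: the top-level subtree decides the shard. */
    static int byPath(String path, Map<String, Integer> subtreeToShard, int defaultShard) {
        if (depth(path) == 0) {
            return defaultShard;
        }
        String top = path.split("/")[1];
        return subtreeToShard.getOrDefault(top, defaultShard);
    }

    /** By level: shallow nodes go to low shards, everything deeper to the last shard. */
    static int byLevel(String path, int shards) {
        return Math.min(depth(path), shards - 1);
    }

    /** By hash: nodes are spread over all shards regardless of position. */
    static int byHash(String path, int shards) {
        return Math.floorMod(path.hashCode(), shards);
    }

    public static void main(String[] args) {
        Map<String, Integer> subtrees = new HashMap<>();
        subtrees.put("a", 1);
        System.out.println(byPath("/a/b", subtrees, 0));
        System.out.println(byLevel("/a/b", 2));
        System.out.println(byHash("/a/b", 4));
    }
}
```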


Access control
Accessible paths
/
/a/b
/d
Existentialism
• All (syntactically valid) paths can be traversed
• But the identified node might not exist
• For example:
root.getChildNode("a").exists() -> false
root.getChildNode("a").getChildNode("b").exists() -> true!

• Implemented as a decorator over the MK
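The decorator idea can be sketched with a small stand-in for the tree. The `Node` interface below only mimics the two calls used in the example above; `PlainNode`, `SecureNode` and the set-based permission check are hypothetical and much simpler than real access control evaluation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical access control decorator, for illustration only. */
final class SecureTreeSketch {
    interface Node {
        boolean exists();
        Node getChildNode(String name);
    }

    static String childPath(String parent, String name) {
        return parent.equals("/") ? "/" + name : parent + "/" + name;
    }

    /** Undecorated tree: a node exists if its path is in the stored set. */
    static final class PlainNode implements Node {
        private final String path;
        private final Set<String> stored;

        PlainNode(String path, Set<String> stored) {
            this.path = path;
            this.stored = stored;
        }

        public boolean exists() {
            return stored.contains(path);
        }

        public Node getChildNode(String name) {
            return new PlainNode(childPath(path, name), stored);
        }
    }

    /** Decorator: any path can be traversed, but unreadable nodes do not "exist". */
    static final class SecureNode implements Node {
        private final Node delegate;
        private final String path;
        private final Set<String> readable;

        SecureNode(Node delegate, String path, Set<String> readable) {
            this.delegate = delegate;
            this.path = path;
            this.readable = readable;
        }

        public boolean exists() {
            return readable.contains(path) && delegate.exists();
        }

        public Node getChildNode(String name) {
            return new SecureNode(delegate.getChildNode(name), childPath(path, name), readable);
        }
    }

    /** Tree /, /a, /a/b, /a/c, /d with accessible paths /, /a/b, /d. */
    static Node exampleRoot() {
        Set<String> stored = new HashSet<>(Arrays.asList("/", "/a", "/a/b", "/a/c", "/d"));
        Set<String> readable = new HashSet<>(Arrays.asList("/", "/a/b", "/d"));
        return new SecureNode(new PlainNode("/", stored), "/", readable);
    }

    public static void main(String[] args) {
        Node root = exampleRoot();
        System.out.println(root.getChildNode("a").exists());
        System.out.println(root.getChildNode("a").getChildNode("b").exists());
    }
}
```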


Comparing revisions
What changed?
Content diff
• Tells what changed between two content trees
• Cornerstone of most higher-level functionality
• validation
• indexing
• observation
• etc.
Examples
• r1 -> r2a: "a" modified, "b" removed
• r1 -> r2b: "d" modified, "e" added
• r1 -> r3: "a" modified, "b" removed, "d" modified, "e" added
[Figure: r2a and r2b branch from r1 and merge into r3]
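A content diff over a single level of names can be sketched like this. It is a deliberately flat illustration: `DiffSketch` is hypothetical, each node is reduced to a name mapped to an opaque non-null value, and a real diff would recurse into child trees.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical flat content diff, for illustration only. */
final class DiffSketch {
    /** Lists what changed between two name-to-value maps, sorted by name. */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        TreeSet<String> names = new TreeSet<>(before.keySet());
        names.addAll(after.keySet());
        List<String> changes = new ArrayList<>();
        for (String name : names) {
            String b = before.get(name);
            String a = after.get(name);
            if (b == null) {
                changes.add(name + " added");
            } else if (a == null) {
                changes.add(name + " removed");
            } else if (!b.equals(a)) {
                changes.add(name + " modified");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> r1 = new HashMap<>();
        r1.put("a", "1");
        r1.put("b", "1");
        r1.put("d", "1");
        Map<String, String> r2a = new HashMap<>(r1);
        r2a.put("a", "2");
        r2a.remove("b");
        System.out.println(diff(r1, r2a)); // a modified, b removed
    }
}
```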
Commit hooks
If this changed, commit this instead
Commit hooks
• Based on given before and after states, a hook can:
• fail the commit, or
• pass the commit unmodified, or
• pass the commit with modifications
• Key plugin mechanism in Oak
• All configured hooks are applied in sequence
• Used for much of the higher-level functionality
• Often implemented using a content diff
Examples
• All kinds of validation
• node types, access control, references, etc.
• Trigger-like functionality
• autocreated content, default values, etc.
• In-content index updates
• etc.
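The three possible outcomes of a hook (fail, pass unmodified, pass with modifications) and the sequential application of hooks can be sketched as below. The `Hook` interface here is a simplified, hypothetical stand-in that works on flat property maps; it is not the signature of Oak's own hook interfaces.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical commit hook chain, for illustration only. */
final class HookChainSketch {
    interface Hook {
        /** Returns the state to commit, or throws to fail the commit. */
        Map<String, String> process(Map<String, String> before, Map<String, String> after);
    }

    /** Validation: fails the commit if the mandatory "title" property is missing. */
    static final Hook REQUIRE_TITLE = (before, after) -> {
        if (!after.containsKey("title")) {
            throw new IllegalStateException("title is mandatory");
        }
        return after; // passes the commit unmodified
    };

    /** Trigger-like behavior: adds a default "status" if none was set. */
    static final Hook DEFAULT_STATUS = (before, after) -> {
        if (after.containsKey("status")) {
            return after;
        }
        Map<String, String> modified = new HashMap<>(after);
        modified.put("status", "draft");
        return modified; // passes the commit with modifications
    };

    /** Applies all hooks in sequence; each hook sees the output of the previous one. */
    static Map<String, String> applyAll(List<Hook> hooks,
            Map<String, String> before, Map<String, String> after) {
        Map<String, String> current = after;
        for (Hook hook : hooks) {
            current = hook.process(before, current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        Map<String, String> after = new HashMap<>();
        after.put("title", "Oak");
        System.out.println(applyAll(Arrays.asList(REQUIRE_TITLE, DEFAULT_STATUS), before, after));
    }
}
```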
Types of hooks
                     CommitHook   Editor      Validator
Content diff         Optional     Always      Always
Can modify commit    Yes          Yes         No
Programming model    Simple       Callbacks   Callbacks
Performance impact   High         Medium      Low
Observers
• Based on given before and after states, an observer can:
• observe what changed in the content tree
• Invoked after the commit, unlike commit hooks
• Always asynchronous for changes from other cluster nodes
• Depending on backend, can be synchronous for changes on the local cluster node
• Often implemented using a content diff
Examples
• JCR Observation
• External index updates
• Cache invalidation
• Logging
• etc.
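The difference from commit hooks is the timing: an observer runs once the commit is already in place and cannot change or reject it. A minimal sketch of the synchronous, local case follows, using a hypothetical `ObservableStoreSketch` with flat maps as states. The callback taking both before and after states is an assumption of this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store that notifies observers after each commit. */
final class ObservableStoreSketch {
    interface Observer {
        void contentChanged(Map<String, String> before, Map<String, String> after);
    }

    private Map<String, String> head = new HashMap<>();
    private final List<Observer> observers = new ArrayList<>();

    void addObserver(Observer observer) {
        observers.add(observer);
    }

    Map<String, String> head() {
        return head;
    }

    void commit(Map<String, String> changes) {
        Map<String, String> before = head;
        Map<String, String> after = new HashMap<>(before);
        after.putAll(changes);
        head = after; // the commit is visible before any observer runs
        for (Observer observer : observers) {
            observer.contentChanged(before, after);
        }
    }

    public static void main(String[] args) {
        ObservableStoreSketch store = new ObservableStoreSketch();
        // Logging observer: prints how property "a" changed
        store.addObserver((before, after) ->
                System.out.println(before.get("a") + " -> " + after.get("a")));
        Map<String, String> change = new HashMap<>();
        change.put("a", "1");
        store.commit(change);
    }
}
```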
Search
Query engine
[Figure: example queries (a SELECT with WHERE x=y, and the path expression /a//*) each pass through one of several parsers and are executed against one of several indexes]
Query processing steps
1. Parsing
a. Select matching parser
b. Parse the query string
2. Execution
a. Estimate cost per index
b. Select index with the least cost estimate
c. Execute the query against the index
3. Post-processing
a. Filter results on access control and additional constraints
b. Apply sorting, grouping, faceting, etc.
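Steps 2 and 3a above can be sketched as follows: ask every index for a cost estimate, run the query against the cheapest one, then filter the results by access control. Parsing, sorting and the other post-processing steps are omitted. `QuerySketch`, its `Index` interface and `FixedIndex` are hypothetical and only loosely modeled on the steps listed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical cost-based index selection, for illustration only. */
final class QuerySketch {
    interface Index {
        double cost(String condition);
        List<String> query(String condition);
    }

    /** Test index with a fixed cost estimate and a fixed result list. */
    static final class FixedIndex implements Index {
        private final double cost;
        private final List<String> paths;

        FixedIndex(double cost, String... paths) {
            this.cost = cost;
            this.paths = Arrays.asList(paths);
        }

        public double cost(String condition) {
            return cost;
        }

        public List<String> query(String condition) {
            return paths;
        }
    }

    /** Step 2a and 2b: returns the index with the least cost estimate, or null. */
    static Index cheapest(List<Index> indexes, String condition) {
        Index best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Index index : indexes) {
            double cost = index.cost(condition);
            if (cost < bestCost) {
                best = index;
                bestCost = cost;
            }
        }
        return best;
    }

    /** Step 2c and 3a: executes against the chosen index, then filters by readability. */
    static List<String> execute(List<Index> indexes, String condition, Predicate<String> readable) {
        List<String> result = new ArrayList<>();
        Index index = cheapest(indexes, condition);
        if (index == null) {
            return result;
        }
        for (String path : index.query(condition)) {
            if (readable.test(path)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Index> indexes = Arrays.<Index>asList(
                new FixedIndex(100, "/a/b", "/d", "/x"),
                new FixedIndex(10, "/a/b", "/a/c"));
        System.out.println(execute(indexes, "x=y", path -> !path.equals("/a/c")));
    }
}
```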
Index implementations
• Property index
• Reference index
• Lucene index
• in-content
• local file system
• Solr index
• embedded
• external
Big picture
JCR API

Oak JCR

Oak API

Oak Core Plugins

NodeStore API

MicroKernel
Questions?
