= Smooks
:toc: macro
:!toc-title:
:toclevels: 3

image:https://img.shields.io/maven-central/v/org.smooks/smooks[Maven Central]
image:https://img.shields.io/nexus/s/org.smooks/smooks?server=https%3A%2F%2Foss.sonatype.org[Sonatype Nexus (Snapshots)]
image:https://github.com/smooks/smooks/workflows/CI/badge.svg[Build Status]

This is the Git source code repository for the http://www.smooks.org[Smooks] project.

toc::[]

== Building

=== Prerequisites

* JDK 8
* Apache Maven 3.2.x

=== Maven

. `git clone https://github.com/smooks/smooks.git`
. `cd smooks`
. `mvn clean install`

NOTE: You will need both Maven (version 3.2.x) and Git installed on your local machine.

=== Docker

You can also build from the https://www.docker.io[Docker] image:

. https://www.docker.io/gettingstarted/[Install Docker].
. Run `sudo docker build -t smooks github.com/smooks/smooks`. This will create a Docker image named _smooks_ that contains the correct build environment and a clone of this Git repo.
. Run `sudo docker run -i smooks mvn clean install` to build the source code.

// tag::getting-started[]
== Getting Started

The easiest way to get started with Smooks is to download and try out the https://github.com/smooks/smooks-examples/tree/v1.0.2[examples]. The examples are the recommended base upon which to integrate Smooks into your application.
// end::getting-started[]

// tag::introduction[]
== Introduction

Smooks is an extensible Java framework for building XML and non-XML data (CSV, EDI, POJOs, etc...) fragment-based applications. It can be used as a lightweight framework on which to hook your own processing logic for a wide range of data formats but, out-of-the-box, Smooks ships with features that can be used individually or seamlessly together:

* *Java Binding*: Populate POJOs from a source (CSV, EDI, XML, POJOs, etc...). Populated POJOs can either be the final result of a transformation, or serve as a bridge for further transformations like what is seen in template resources which generate textual results such as XML. Additionally, Smooks supports collections (maps and lists of typed data) that can be referenced from expression languages and templates.

* *Transformation*: perform a wide range of data transformations and mappings. XML to XML, CSV to XML, EDI to XML, XML to EDI, XML to CSV, POJO to XML, POJO to EDI, POJO to CSV, etc...

* *Templating*: extensible template-driven transformations, with support for https://www.w3.org/TR/xslt/[XSLT], https://freemarker.apache.org/[FreeMarker], and https://www.stringtemplate.org/[StringTemplate].

* *Huge Message Processing*: process huge messages (gigabytes!). Split, transform and route fragments to JMS, filesystem, database, and other destinations.

* *Fragment Enrichment*: enrich fragments with data from a database or other data sources.

* *Complex Fragment Validation*: rule-based fragment validation.

* *Fragment Persistence*: read fragments from, and save fragments to, a database with either JDBC, persistence frameworks (like MyBatis, Hibernate, or any JPA compatible framework), or DAOs.

* *Combine*: leverage Smooks's transformation, routing and persistence functionality for _Extract Transform Load_ (ETL) operations.

* *Validation*: perform basic or complex validation on fragment content. This is more than simple type/value-range validation.

=== Why Smooks?

Smooks was conceived to perform _fragment-based transformations_ on messages. Supporting fragment-based transformation opened up the possibility of mixing and matching different technologies within the context of a single transformation. This meant that one could leverage distinct technologies for transforming fragments, depending on the type of transformation required by the fragment in question.

In the process of evolving this fragment-based transformation solution, it dawned on us that we were establishing a fragment-based processing paradigm. Concretely, a framework was being built for targeting custom link:#visitors[visitor] logic at message fragments. A visitor does not need to be restricted to transformation. A visitor could be implemented to apply all sorts of operations on fragments, and therefore, the message as a whole.

Smooks supports a wide range of data structures - XML, EDI, JSON, CSV, POJOs (POJO to POJO!). A pluggable reader interface allows you to plug in a reader implementation for any data format.

=== Fragment-Based Processing

The primary design goal of Smooks is to provide a framework that isolates and processes fragments in structured data (XML and non-XML) using existing data processing technologies (such as XSLT, plain vanilla Java, Groovy script).

A visitor targets a fragment with the visitor's resource _selector_ value. The targeted fragment can take in as much or as little of the source stream as you like. A fragment is identified by the name of the node enclosing the fragment. You can target the whole stream using the node name of the root node as the selector or through the reserved `+#document+` selector.

NOTE: The terms _fragment_ and _node_ have different meanings. It is usually acceptable to use the terms interchangeably because the difference is subtle and, more often than not, irrelevant. A _node_ may be the outer node of a fragment, excluding the child nodes. A _fragment_ is the outer node and all its child nodes along with their character nodes (text, etc...). When a visitor targets a node, it typically means that the visitor can only process the fragment's outer node as opposed to the fragment as a whole, that is, the outer node and its child nodes.

=== What's new in Smooks 2?

Smooks 2 introduces the DFDL cartridge and revamps its EDI cartridge, while dropping support for Java 7 along with a few other notable breaking changes:

* DFDL cartridge
    ** DFDL is a specification for describing file formats in XML. The DFDL cartridge leverages https://daffodil.apache.org/[Apache Daffodil] to parse files and unparse XML. This opens up Smooks to a wide array of data formats like SWIFT, ISO8583, HL7, and many more.
* Pipeline support
    ** Compose any series of transformations on an event outside the main execution context before directing the pipeline output to the execution result stream or to other destinations
* Complete overhaul of the EDI cartridge
    ** Rewritten to extend the DFDL cartridge and provide much better support for reading EDI documents
    ** Added functionality to serialize EDI documents
    ** As in previous Smooks versions, incorporated special support for EDIFACT
* SAX NG filter
    ** Replaces SAX filter and supersedes DOM filter
    ** Brings with it a new visitor API which unifies the SAX and DOM visitor APIs
    ** Cartridges migrated to SAX NG
    ** Supports XSLT and StringTemplate resources unlike the legacy SAX filter
* Mementos: a convenient way to stash and un-stash a visitor's state during its execution lifecycle
* Independent release cycles for all cartridges and one https://www.smooks.org/v2/maven[Maven BOM] (bill of materials) to track them all
* License change
    ** After reaching consensus among our code contributors, we've dual-licensed Smooks under https://choosealicense.com/licenses/lgpl-3.0/[LGPL v3.0] and https://choosealicense.com/licenses/apache-2.0/[Apache License 2.0]. This license change keeps Smooks open source while adopting a permissive stance to modifications.
* New Smooks XSD schema (`+xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"+`)
    ** Uniform XML namespace declarations: dropped `+default-selector-namespace+` and `+selector-namespace+` XML attributes in favour of declaring namespaces within the standard `+xmlns+` attribute from the `+smooks-resource-config+` element.
    ** Removed `+default-selector+` attribute from `+smooks-resource-config+` element: selectors need to be set explicitly
* Dropped Smooks-specific annotations in favour of JSR annotations
    ** Farewell `+@ConfigParam+`, `+@Config+`, `+@AppContext+`, and `+@StreamResultWriter+`. Welcome `+@Inject+`.
    ** Farewell `+@Initialize+` and `+@Uninitialize+`. Welcome `+@PostConstruct+` and `+@PreDestroy+`.
* Separate top-level Java namespaces for API and implementation to provide a cleaner and more intuitive package structure: API interfaces and internal classes were relocated to `+org.smooks.api+` and `+org.smooks.engine+` respectively
* Improved XPath support for resource selectors
    ** Functions like `not()` are now supported
* Numerous dependency updates
* Maven coordinates change: we are now publishing Smooks artifacts under Maven group IDs prefixed with `+org.smooks+`
* Replaced default SAX parser implementation from Apache Xerces to https://github.com/FasterXML/woodstox[FasterXML's Woodstox]: benchmarks consistently showed Woodstox outperforming Xerces

=== Migrating from Smooks 1.7 to 2.0

. Smooks 2 no longer supports Java 7. Your application needs to be compiled to at least Java 8 to run Smooks 2.
. Replace references to Java packages `org.milyn` with `org.smooks.api`, `org.smooks.engine`, `org.smooks.io` or `org.smooks.support`.
. Inherit from `org.smooks.api.resource.visitor.sax.ng.SaxNgVisitor` instead of `org.milyn.delivery.sax.SAXVisitor`.
. Change legacy document root fragment selectors from `$document` to `#document`.
. Replace Smooks Maven coordinates to match the coordinates as described in the https://www.smooks.org/v2/maven[Maven guide].
. Replace `ExecutionContext#isDefaultSerializationOn()` method calls with
`ExecutionContext#getContentDeliveryRuntime().getDeliveryConfig().isDefaultSerializationOn()`.
. Replace `ExecutionContext#getContext()` method calls with `+ExecutionContext#getApplicationContext()+`.
. Replace `org.smooks.delivery.dom.serialize.SerializationVisitor` references with `org.smooks.api.resource.visitor.SerializerVisitor`.
. Replace `org.smooks.cdr.annotation.AppContext` annotations with `javax.inject.Inject` annotations.
. Replace `org.smooks.cdr.annotation.ConfigParam` annotations with `javax.inject.Inject` annotations:
    * Substitute the `@ConfigParam` name attribute with the `@javax.inject.Named` annotation.
    * Wrap `java.util.Optional` around the field to mimic the behaviour of the `@ConfigParam` optional attribute.
. Replace `org.smooks.delivery.annotation.Initialize` annotations with `javax.annotation.PostConstruct` annotations.
. Replace `org.smooks.delivery.annotation.Uninitialize` annotations with `javax.annotation.PreDestroy` annotations.
. Replace references to `org.smooks.javabean.DataDecode` with `org.smooks.api.converter.TypeConverterFactory`.
. Replace references to `org.smooks.cdr.annotation.Configurator` with `org.smooks.api.lifecycle.LifecycleManager`.
. Replace references to `org.smooks.javabean.DataDecoderException` with `org.smooks.api.converter.TypeConverterException`.
. Replace references to `org.smooks.cdr.SmooksResourceConfigurationStore` with `org.smooks.api.Registry`.
. Replace references to `org.milyn.cdr.SmooksResourceConfiguration` with `org.smooks.api.resource.config.ResourceConfig`.
. Replace references to `org.milyn.delivery.sax.SAXToXMLWriter` with `org.smooks.io.DomSerializer`.

=== FAQs

See the https://www.smooks.org/v2/faq[FAQ].

=== Maven

See the https://www.smooks.org/v2/maven[Maven guide] for details on how to integrate Smooks into your project via Maven.
// end::introduction[]

// tag::fundamentals[]
== Fundamentals

Smooks is commonly described as a "Transformation Engine". Nonetheless, at its core, Smooks makes no reference to _data transformation_. The core codebase is designed to hook visitor logic into an event stream produced from a source of some kind. As such, in its most distilled form, Smooks is a _Structured Data Event Stream Processor_.

An application of a structured data event processor is transformation. In implementation terms, a Smooks transformation solution is a visitor reading the event stream from a source to produce a different representation of the input. However, Smooks's core capabilities enable much more than transformation. A range of other solutions can be implemented based on the fragment-based processing model:

* *Java Binding*: population of a POJO from the source.

* *Splitting & Routing*: perform complex splitting and routing operations on the source stream, including routing data in different formats (XML, EDI, CSV, POJO, etc...) to multiple destinations concurrently.

* *Huge Message Processing*: declaratively consume (transform, or split and route) huge messages without writing boilerplate code.

=== Basic Processing Model

Smooks's fundamental behaviour is to take an input _source_, such as XML, and from it generate an _event stream_ to which _visitors_ are applied to produce a _result_ such as EDI.

Several sources and result types are supported which equate to different transformation types, including but not limited to:

* XML to XML
* XML to POJO
* POJO to XML
* POJO to POJO
* EDI to XML
* EDI to POJO
* POJO to EDI
* CSV to XML
* CSV to ...
* ... to ...

Smooks maps the source to the result with the help of a highly-tunable SAX event model. The hierarchical events generated from an XML source (_startElement_, _endElement_, etc...) drive the SAX event model though the event model can be just as easily applied to other structured data sources (EDI, CSV, POJO, etc...). The most important events are typically the _before_ and _after_ visit events. The following illustration conveys the hierarchical nature of these events.

image:docs/images/Event-model.gif[Image:event-model.gif]
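
The nesting of _before_ and _after_ events can be reproduced with nothing but the JDK's SAX parser. The sketch below is only an illustration of the event model, not Smooks code: `+EventTrace+` is a hypothetical class, and it maps `+startElement+` to a "before" event and `+endElement+` to an "after" event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Smooks: records SAX events to show how they nest.
class EventTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Comparable to a "before" visit: fired before any child of the element is seen.
        events.add(indent(depth) + "before " + qName);
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Comparable to an "after" visit: fired once all of the element's children are done.
        depth--;
        events.add(indent(depth) + "after " + qName);
    }

    private static String indent(int depth) {
        StringBuilder spaces = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            spaces.append("  ");
        }
        return spaces.toString();
    }

    // Parses the XML string and returns the recorded events in document order.
    static List<String> trace(String xml) {
        EventTrace handler = new EventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the input", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<order><header/><order-item><product>1</product></order-item></order>")
                .forEach(System.out::println);
    }
}
```

Every "after" event of a parent is emitted only once all the events of its children have been emitted, which is why an after-visit can see a fragment's content while a before-visit cannot.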

=== Hello World App

One or more of the https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/api/resource/visitor/sax/ng/SaxNgVisitor.html[SaxNgVisitor] interfaces need to be implemented in order to consume the SAX event stream produced from the source, depending on which events are of interest.

The following is a hello world app demonstrating how to implement a visitor that is fired on the `+visitBefore+` and `+visitAfter+` events of a targeted node in the event stream. In this case, Smooks configures the visitor to target element `+foo+`:

image:docs/images/Simple-example.png[Image:simple-example.png]

The visitor implementation is straightforward: one method implementation per event. As shown above, a Smooks config (more about `+resource-config+` later on) is written to target the visitor at a node's `+visitBefore+` and `+visitAfter+` events.

The Java code executing the hello world app is a two-liner:

[source,java]
----
Smooks smooks = new Smooks("/smooks/echo-example.xml");
smooks.filterSource(new StreamSource(inputStream));
----

Observe that in this case the program does not produce a result. The program does not even interact with the filtering process in any way because it does not provide an https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/api/ExecutionContext.html[`+ExecutionContext+`] to https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/Smooks.html[`+smooks.filterSource(...)+`].

This example illustrates the lower-level mechanics of Smooks's programming model. In reality, most users are not going to want to solve their problems at this level of detail. Smooks ships with substantial pre-built functionality, that is, pre-built visitors. Visitors are bundled based on functionality: these bundles are called _Cartridges_.

=== Smooks Resources

A Smooks execution consumes a source of one form or another (XML, EDI, POJO, JSON, CSV, etc...), and from it, generates an event stream that fires different visitors (Java, Groovy, DFDL, XSLT, etc...). The goal of this process can be to produce a new result stream in a different format (data transformation), bind data from the source to POJOs and produce a populated Java object graph (Java binding), produce many fragments (splitting), and so on.

At its core, Smooks views visitors and other abstractions as resources. A _resource_ is applied when a _selector_ matches a node in the event stream. The generality of such a processing model can be daunting from a usability perspective because resources are not tied to a particular domain. To counteract this, Smooks 1.1 introduced an _Extensible Configuration Model_ feature that allows specific resource types to be specified in the configuration using dedicated XSD namespaces of their own. Instead of having a generic resource config such as:

[source,xml]
----

    
    

----

an Extensible Configuration Model allows us to have a domain-specific resource config:

[source,xml]
----

    
    

----

When comparing the above snippets, the latter resource has:

. A more strongly typed domain specific configuration and so is easier to read,
. Auto-completion support from the user's IDE because the Smooks 1.1+ configurations are XSD-based, and
. No need to set the resource type in its configuration.

==== Visitors

Central to how Smooks works is the concept of a visitor. A visitor is a Java class performing a specific task on the targeted fragment such as applying an XSLT script, binding fragment data to a POJO, validating fragments, etc...

==== Selectors

Resource selectors are another central concept in Smooks. A selector chooses the node/s a visitor should visit, as well as working as a simple opaque lookup value for non-visitor logic.

When the resource is a visitor, Smooks will interpret the selector as an http://www.w3.org/TR/xpath/[XPath-like] expression. There are a number of things to be aware of:

. The XPath expression is applied in the reverse of the normal order seen in, for example, an XSLT script. Smooks inspects backwards from the targeted fragment node, as opposed to forwards from the root node.
. Not all of the XPath specification is supported. A selector supports the following XPath syntax:
    * `+text()+` and attribute value selectors: `+a/b[text() = 'abc']+`, `+a/b[text() = 123]+`, `+a/b[@id = 'abc']+`, `+a/b[@id = 123]+`.
        ** `+text()+` is only supported on the last selector step in an expression: `+a/b[text() = 'abc']+` is legal while `+a/b[text() = 'abc']/c+` is illegal.
        ** `+text()+` is only supported on visitor implementations that implement the `+AfterVisitor+` interface *only*. If the visitor implements the `+BeforeVisitor+` or `+ChildrenVisitor+` interfaces, an error will result.
    * `+or+` & `+and+` logical operations: `+a/b[text() = 'abc' and @id = 123]+`, `+a/b[text() = 'abc' or @id = 123]+`
    * Namespaces on both the elements and attributes: `+a:order/b:address[@b:city = 'NY']+`.
+
NOTE: This requires the namespace prefix-to-URI mappings to be defined. A configuration error will result if not defined. Read the link:#namespace-declaration[namespace declaration] section for more details.
+
    * Supports `+=+` (equals), `+!=+` (not equals), `+<+` (less than), `+>+` (greater than).
    * Index selectors: `+a/b[3]+`.
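
The backwards evaluation described in the first point can be pictured with a small sketch. It is not Smooks's selector engine: `+ReverseSelectorMatcher+` is a hypothetical class that handles only plain `+a/b+` style steps (no predicates, namespaces, or indexes) and compares them against the element path starting from the targeted node.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Illustrative only: this is not how Smooks implements selectors.
class ReverseSelectorMatcher {

    /**
     * Returns true when the selector steps match the tail of the element path.
     * ancestorsAndSelf is ordered from the root down to the targeted node.
     */
    static boolean matches(String selector, String... ancestorsAndSelf) {
        String[] steps = selector.split("/");

        // Push the path onto a stack so that iteration starts at the targeted node
        // and walks backwards towards the root.
        Deque<String> path = new ArrayDeque<>();
        for (String name : ancestorsAndSelf) {
            path.push(name);
        }

        Iterator<String> fromTargetedNode = path.iterator();
        for (int i = steps.length - 1; i >= 0; i--) {
            // Fail when the path runs out of ancestors or a step name differs.
            if (!fromTargetedNode.hasNext() || !fromTargetedNode.next().equals(steps[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("a/b", "root", "a", "b")); // b's parent is a
        System.out.println(matches("a/b", "a", "x", "b"));    // b's parent is x
    }
}
```

Because matching starts at the targeted node, the selector never has to spell out the path from the root: `+a/b+` matches any `+b+` whose parent is `+a+`, wherever that pair sits in the document.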

==== Namespace Declaration

The `+xmlns+` attribute is used to bind a selector prefix to a namespace:

[source,xml]
----



    
        com.acme.visitors.MyCustomVisitorImpl
    


----

Alternatively, namespace prefix-to-URI mappings can be declared using the legacy core config `+namespace+` element:

[source,xml]
----



    
        
        
    

    
        com.acme.visitors.MyCustomVisitorImpl
    


----

=== Input

Smooks relies on a _Reader_ for ingesting a source and generating a SAX event stream. A reader is any class implementing https://docs.oracle.com/javase/8/docs/api/org/xml/sax/XMLReader.html[`+XMLReader+`]. By default, Smooks uses the `+XMLReader+` returned from https://docs.oracle.com/javase/8/docs/api/org/xml/sax/helpers/XMLReaderFactory.html[`+XMLReaderFactory.createXMLReader()+`]. You can easily implement your own `+XMLReader+` to create a non-XML reader that generates the source event stream for Smooks to process:

[source,xml]
----



    

    


----

The `+reader+` config element is referencing a user-defined `+XMLReader+`. It can be configured with a set of handlers, features and parameters:

[source,xml]
----

    
        
        
    
    
        
        
        
        
    
    
        val1
        val2
    

----

Packaged Smooks modules, known as link:#cartridge[cartridges], provide support for non-XML readers but, by default, Smooks expects an XML source. Omit the class name from the `+reader+` element to set features on the default XML reader:

[source,xml]
----

    
        
        
        
        
    

----
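
To make the idea of a non-XML reader more concrete, here is a minimal sketch of an `+XMLReader+` that turns comma-separated lines into SAX events. Everything in it is an assumption made for illustration: the class name `+SimpleCsvReader+`, the `+records+`/`+record+`/`+fieldN+` element names, and the naive comma splitting that ignores quoting. It uses only JDK classes and says nothing about any extra contract Smooks may place on readers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical reader. XMLFilterImpl is extended only to inherit default
// implementations of the many XMLReader methods.
class SimpleCsvReader extends XMLFilterImpl {
    private static final Attributes NO_ATTRIBUTES = new AttributesImpl();

    @Override
    public void parse(InputSource input) throws IOException, SAXException {
        if (input.getCharacterStream() == null) {
            throw new SAXException("This sketch only supports character streams");
        }
        ContentHandler handler = getContentHandler();
        if (handler == null) {
            return; // nobody is listening
        }
        BufferedReader lines = new BufferedReader(input.getCharacterStream());

        handler.startDocument();
        handler.startElement("", "records", "records", NO_ATTRIBUTES);
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            handler.startElement("", "record", "record", NO_ATTRIBUTES);
            String[] fields = line.split(",", -1); // naive: no quoted fields
            for (int i = 0; i < fields.length; i++) {
                String name = "field" + (i + 1);
                char[] text = fields[i].toCharArray();
                handler.startElement("", name, name, NO_ATTRIBUTES);
                handler.characters(text, 0, text.length);
                handler.endElement("", name, name);
            }
            handler.endElement("", "record", "record");
        }
        handler.endElement("", "records", "records");
        handler.endDocument();
    }

    // Renders the generated events as markup so the stream is easy to inspect.
    static String describe(String csv) {
        final StringBuilder out = new StringBuilder();
        SimpleCsvReader reader = new SimpleCsvReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                out.append('<').append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName).append('>');
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(csv)));
        } catch (IOException | SAXException e) {
            throw new IllegalStateException("Could not read the CSV input", e);
        }
        return out.toString();
    }
}
```

A consumer of this reader cannot tell that the source was not XML: it only sees `+startElement+`, `+characters+` and `+endElement+` events, which is what allows the same visitor logic to be applied to XML and non-XML sources.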

=== Output

Smooks can present output to the outside world in two ways:

. As instances of https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/Result.html[`+Result+`]: client code extracts output from the `+Result+` instance after passing an empty one to `+Smooks#filterSource(...)+`.

. As side effects: during filtering, resource output is sent to web services, local storage, queues, data stores, and other locations. Events trigger the routing of fragments to external endpoints such as what happens when https://github.com/smooks/smooks-routing-cartridge/blob/master/README.adoc[splitting and routing].

Unless configured otherwise, a Smooks execution does not accumulate the input data to produce all the outputs. The reason is simple: performance! Consider a document consisting of hundreds of thousands (or millions) of orders that need to be split up and routed to different systems in different formats, based on different conditions. The only way of handling documents of these magnitudes is by streaming them.

IMPORTANT: Smooks can generate output in either, or both, of the above ways, all in a single filtering pass of the source. It does not need to filter the source multiple times in order to generate multiple outputs, critical for performance.
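
The following sketch shows the general idea of splitting and routing in a single streaming pass, using the JDK's StAX API rather than Smooks. The class `+StreamingRouter+`, the element names, the two in-memory lists standing in for destinations, and the quantity threshold of 10 are all assumptions chosen for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical router: only one order-item's data is held at any time.
class StreamingRouter {
    final List<String> bulkOrders = new ArrayList<>();  // stand-in for one destination
    final List<String> smallOrders = new ArrayList<>(); // stand-in for another destination

    void route(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            String product = null;
            int quantity = 0;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("product")) {
                        product = reader.getElementText();
                    } else if (name.equals("quantity")) {
                        quantity = Integer.parseInt(reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("order-item")) {
                    // The fragment is complete: route it, then forget it.
                    if (quantity >= 10) {
                        bulkOrders.add(product);
                    } else {
                        smallOrders.add(product);
                    }
                    product = null;
                    quantity = 0;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the input", e);
        }
    }
}
```

Both destinations are fed during the same pass over the source, and memory use depends on the size of one fragment rather than on the size of the whole document.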

==== Result

A look at the Smooks API reveals that Smooks can be supplied with multiple `+Result+` instances:

[source,java]
----
public void filterSource(Source source, Result... results) throws SmooksException
----

Smooks can work with the standard JDK https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/stream/StreamResult.html[`+StreamResult+`] and https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/dom/DOMResult.html[`+DOMResult+`] result types, as well as the Smooks specific ones:

* https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/io/payload/JavaResult.html[`+JavaResult+`]: result type for capturing the contents of the Smooks JavaBean context.

* https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/io/payload/StringResult.html[`+StringResult+`]: `+StreamResult+` extension wrapping a `+StringWriter+`, useful for testing.

IMPORTANT: As yet, Smooks does not support capturing output to multiple `+Result+` instances of the same type. For example, you can specify multiple `+StreamResult+` instances in `+Smooks.filterSource(...)+` but Smooks will only output to the first `+StreamResult+` instance.

===== Stream Results

The `+StreamResult+` and `+DOMResult+` types receive special attention from Smooks. When the link:#user-content-settings[`+default.serialization.on+`] global parameter is turned on, which by default it is, Smooks serializes the stream of events to XML while filtering the source. The XML is fed to the `+Result+` instance if a `+StreamResult+` or `+DOMResult+` is passed to `+Smooks#filterSource+`.

NOTE: This is the mechanism used to perform a standard 1-input/1-xml-output character-based transformation.

==== Side Effects

Smooks is also able to generate different types of output during filtering, that is, while filtering the source event stream but before it reaches the end of the stream. A classic example of this output type is when it is used to split and route fragments to different endpoints for processing by other processes.

=== Pipeline

A pipeline is a flexible, yet simple, Smooks construct that isolates the processing of a targeted event from its main processing as well as from the processing of other pipelines. In practice, this means being able to compose any series of transformations on an event outside the main execution context before directing the pipeline output to the execution result stream or to other destinations. With pipelines, you can enrich data, rename/remove nodes, and much more.

Under the hood, a pipeline is just another instance of Smooks, made self-evident from the Smooks config element declaring a pipeline:

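[source,xml]
----
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:core="https://www.smooks.org/xsd/smooks/smooks-core-1.6.xsd">
   <core:smooks filterSourceOn="...">
       <core:action>
           ...
       </core:action>
       <core:config>
           <smooks-resource-list>
               ...
           </smooks-resource-list>
       </core:config>
   </core:smooks>
</smooks-resource-list>
----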

`+core:smooks+` fires a nested Smooks execution whenever an event in the stream matches the `+filterSourceOn+` selector. The pipeline within the inner `+smooks-resource-list+` element visits the selected event and its child events. It is worth highlighting that the inner `+smooks-resource-list+` element behaves identically to the outer one, and therefore, it accepts resources like visitors, readers, and even pipelines (a pipeline within a pipeline!). Moreover, a pipeline is transparent to its nested resources: a resource’s behaviour remains the same whether it’s declared inside a pipeline or outside it.

The optional `+core:action+` element tells the nested Smooks instance what to do with the pipeline’s output. The next sections list the supported actions.

==== Inline

Merges the pipeline's output with the result stream:

[source,xml]
----
...
<core:action>
    <core:inline>
        ...
    </core:inline>
</core:action>
...
----

As described in the subsequent sections, an inline action replaces, prepends, or appends content.

===== Replace

Substitutes the selected fragment with the pipeline output:

[source,xml]
----
...
<core:inline>
    <core:replace/>
</core:inline>
...
----

===== Prepend Before

Adds the output before the selector start tag:

[source,xml]
----
<core:inline>
    <core:prepend-before/>
</core:inline>
----

===== Prepend After

Adds the output after the selector start tag:

[source,xml]
----
<core:inline>
    <core:prepend-after/>
</core:inline>
----

===== Append Before

Adds the output before the selector end tag:

[source,xml]
----
<core:inline>
    <core:append-before/>
</core:inline>
----

===== Append After

Adds the output after the selector end tag:

[source,xml]
----
<core:inline>
    <core:append-after/>
</core:inline>
----
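
Where each of the five inline actions places the pipeline output is easiest to see side by side. The sketch below models a fragment as three strings (start tag, content, end tag) and concatenates them with the output. It is a plain-Java illustration of the descriptions above, with a hypothetical `+InlineActionSketch+` class, and does not use or mirror the Smooks implementation.

```java
// Illustrative model of the inline actions; the names are hypothetical.
class InlineActionSketch {

    enum Action { REPLACE, PREPEND_BEFORE, PREPEND_AFTER, APPEND_BEFORE, APPEND_AFTER }

    static String apply(Action action, String startTag, String content, String endTag, String output) {
        switch (action) {
            case REPLACE:
                // The selected fragment is substituted with the pipeline output.
                return output;
            case PREPEND_BEFORE:
                // Output goes before the start tag.
                return output + startTag + content + endTag;
            case PREPEND_AFTER:
                // Output goes right after the start tag.
                return startTag + output + content + endTag;
            case APPEND_BEFORE:
                // Output goes right before the end tag.
                return startTag + content + output + endTag;
            case APPEND_AFTER:
                // Output goes after the end tag.
                return startTag + content + endTag + output;
            default:
                throw new IllegalArgumentException("Unknown action: " + action);
        }
    }

    public static void main(String[] args) {
        for (Action action : Action.values()) {
            System.out.println(action + ": " + apply(action, "<a>", "text", "</a>", "<x/>"));
        }
    }
}
```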

==== Bind To

Binds the output to the execution context’s bean store:

[source,xml]
----
...
<core:action>
    <core:bind-to id="..."/>
</core:action>
...
----

==== Output To

Directs the output to a different stream other than the result stream:

[source,xml]
----
...
<core:action>
    <core:output-to outputStreamResource="..."/>
</core:action>
...
----

=== Cartridge

The basic functionality of Smooks can be extended through the development of a Smooks cartridge. A cartridge is a Java archive (JAR) containing reusable resources (also known as _Content Handlers_). A cartridge augments Smooks with support for a specific type of input source or event handling.

Visit the https://github.com/smooks/?q=-cartridge&type=&language=&sort=[GitHub organisation page] for the complete list of Smooks cartridges.

=== Filter

A Smooks filter delivers generated events from a reader to the application's resources. Smooks 1 had the DOM and SAX filters. The DOM filter was simple to use but kept all the events in memory while the SAX filter, though more complex, delivered the events in streaming fashion. Having two filter types meant two different visitor APIs and execution paths, with all the baggage it entailed.

Smooks 2 unifies the legacy DOM and SAX filters without sacrificing convenience or performance. The new SAX NG filter drops the API distinction between DOM and SAX. Instead, the filter streams SAX events as *partial* DOM elements to SAX NG visitors targeting the element. A SAX NG visitor can read the targeted node as well as any of the node's ancestors but not the targeted node's children or siblings in order to keep the memory footprint to a minimum.

The SAX NG filter can mimic DOM by setting its `+max.node.depth+` parameter to 0 (default value is 1), allowing each visitor to process the complete DOM tree in its `+visitAfter(...)+` method:

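[source,xml]
----
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd">
    <params>
        <param name="max.node.depth">0</param>
    </params>
    ...
</smooks-resource-list>
----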

A `+max.node.depth+` value greater than 1 tells the filter to read and keep a node's descendants up to the desired depth. Take the following input as an example:

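[source,xml]
----
<order>
    <header>
        <customer>Joe</customer>
    </header>
    <order-items>
        <order-item>
            <product>1</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
        <order-item>
            <product>2</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
        <order-item>
            <product>3</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
    </order-items>
</order>
----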

Along with the config:

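[source,xml]
----
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd">
    <params>
        <param name="max.node.depth">2</param>
    </params>

    <resource-config selector="order-item">
        <resource>org.acme.MyVisitor</resource>
    </resource-config>

</smooks-resource-list>
----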

At any given time, there will always be a single _order-item_ in memory containing _product_ because `+max.node.depth+` is 2. Each new _order-item_ overwrites the previous _order-item_ to minimise the memory footprint. `+MyVisitor#visitAfter(...)+` is invoked 3 times, each invocation corresponding to an _order-item_ fragment. The first invocation will process:

[source,xml]
----
<order-item>
    <product>1</product>
</order-item>
----

While the second invocation will process:

[source,xml]
----
<order-item>
    <product>2</product>
</order-item>
----

Whereas the last invocation will process:

[source,xml]
----
<order-item>
    <product>3</product>
</order-item>
----

Programmatically, implementing `+org.smooks.api.resource.visitor.sax.ng.ParameterizedVisitor+` will give you fine-grained control over the visitor's targeted element depth:

[source,java]
----
...
public class DomVisitor implements ParameterizedVisitor {

    @Override
    public void visitBefore(Element element, ExecutionContext executionContext) {
    }

    @Override
    public void visitAfter(Element element, ExecutionContext executionContext) {
        System.out.println("Element: " + XmlUtil.serialize(element, true));
    }

    @Override
    public int getMaxNodeDepth() {
        return Integer.MAX_VALUE;
    }
}
----

`+ParameterizedVisitor#getMaxNodeDepth()+` returns an integer denoting the targeted element's maximum tree depth the visitor can accept in its `+visitAfter(...)+` method.

==== Settings

Filter-specific knobs are set through the _smooks-core_ configuration namespace (`+https://www.smooks.org/xsd/smooks/smooks-core-1.6.xsd+`) introduced in Smooks 1.3:

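[source,xml]
----
<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      xmlns:core="https://www.smooks.org/xsd/smooks/smooks-core-1.6.xsd">

    <core:filterSettings type="SAX NG" <1>
                         defaultSerialization="true" <2>
                         terminateOnException="true" <3>
                         closeSource="true" <4>
                         closeResult="true" <5>
                         rewriteEntities="true" <6>
                         readerPoolSize="3"/> <7>

    ...

</smooks-resource-list>
----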
<1> `+type+` (default: `+SAX NG+`): the type of processing model that will be used. `+SAX NG+` is the recommended type. The `+DOM+` type is deprecated.

<2> `+defaultSerialization+` (default: `+true+`): if default serialization should be switched on. Default serialization being turned on simply tells Smooks to locate a `+StreamResult+` (or `+DOMResult+`) in the Result objects provided to the `+Smooks.filterSource+` method and to serialize all events to that `+Result+` instance. This behavior can be turned off using this global configuration parameter and can be overridden on a per-fragment basis by targeting a visitor at that fragment that takes ownership of the `+org.smooks.io.FragmentWriter+` object.

<3> `+terminateOnException+` (default: `+true+`): whether an exception should terminate execution.

<4> `+closeSource+` (default: `+true+`): close `+Inp+` instance streams passed to the `+Smooks.filterSource+` method. The exception here is `+System.in+`, which will never be closed.

<5> `+closeResult+` (default: `+true+`): close `+Result+` streams passed to the `+Smooks.filterSource+` method. The exceptions here are `+System.out+` and `+System.err+`, which will never be closed.

<6> `+rewriteEntities+`: rewrite XML entities when reading and writing (default serialization) XML.

<7> `+readerPoolSize+` (default: `+0+`): the reader pool size. Some reader implementations are very expensive to create (e.g. Xerces). Pooling reader instances (i.e. reusing them) can result in a huge performance improvement, especially when processing lots of "small" messages. The default value of 0 means unpooled: a new reader instance is created for each message. Configure this setting in line with your application's threading model.
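
The trade-off behind `+readerPoolSize+` can be sketched with a small generic pool. This is an assumption-laden illustration, not Smooks's pooling code: `+ReaderPool+`, `+borrow()+` and `+giveBack()+` are hypothetical names, and a pool size of 0 is modelled as "create a new instance on every borrow".

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical pool for objects that are expensive to create.
class ReaderPool<R> {
    private final BlockingQueue<R> idle; // null when pooling is disabled (size 0)
    private final Supplier<R> factory;

    ReaderPool(int poolSize, Supplier<R> factory) {
        this.idle = poolSize > 0 ? new ArrayBlockingQueue<R>(poolSize) : null;
        this.factory = factory;
    }

    // Reuses an idle instance when one is available, otherwise creates a new one.
    R borrow() {
        R reader = (idle == null) ? null : idle.poll();
        return (reader != null) ? reader : factory.get();
    }

    // Returns an instance for reuse; it is simply dropped when the pool is
    // disabled or already full.
    void giveBack(R reader) {
        if (idle != null) {
            idle.offer(reader);
        }
    }
}
```

With a pool size of 3 and a single thread processing messages one after another, only one instance is ever created. With concurrent threads, up to one instance per in-flight message is created and at most three are retained, which is why the setting should follow the application's threading model.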

==== Troubleshooting

Smooks streams events that can be captured and inspected, either in-flight or after execution. `+HtmlReportGenerator+` is one such class: it inspects in-flight events to generate an HTML report of the execution:

[source,java]
----
Smooks smooks = new Smooks("/smooks/smooks-transform-x.xml");
ExecutionContext executionContext = smooks.createExecutionContext();

executionContext.getContentDeliveryRuntime().addExecutionEventListener(new HtmlReportGenerator("/tmp/smooks-report.html"));
smooks.filterSource(executionContext, new StreamSource(inputStream), new StreamResult(outputStream));
----

`+HtmlReportGenerator+` is a useful tool in the developer's arsenal for diagnosing issues, or for comprehending a transformation.

An example `+HtmlReportGenerator+` report can be seen http://www.milyn.org/docs/smooks-report/report.html[online here].

Of course you can also write and use your own https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/api/delivery/event/ExecutionEventListener.html[ExecutionEventListener] implementations.

CAUTION: Only use the `+HtmlReportGenerator+` in development. When enabled, the `+HtmlReportGenerator+` incurs a significant performance overhead and, with large messages, can even result in `+OutOfMemoryError+`s.

==== Terminate

You can terminate Smooks's filtering before it reaches the end of a stream. The following config terminates filtering at the end of the customer fragment:

[source,xml]
----



    
    


----

The default behavior is to terminate at the end of the targeted fragment, on the `+visitAfter+` event. To terminate at the start of the targeted fragment, on the `+visitBefore+` event, set the `+terminateBefore+` attribute to `+true+`:

[source,xml]
----



    
    


----
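
The difference between terminating on `+visitAfter+` and on `+visitBefore+` can be illustrated with the plain JDK SAX parser. This is a sketch of the general idea only, not of how Smooks implements termination: `+TerminateSketch+`, `+TerminateException+` and `+parse+` are hypothetical names, and stopping the parser by throwing an exception is a technique assumed here for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration using the JDK SAX parser; not Smooks code.
class TerminateSketch {

    // Signals a deliberate early stop rather than a parse failure.
    static class TerminateException extends SAXException {
        TerminateException() {
            super("terminated");
        }
    }

    // Returns the names of the elements whose start tag was processed before the stop.
    static List<String> parse(String xml, String target, boolean terminateBefore) {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                // Stop at the start of the targeted fragment.
                if (terminateBefore && qName.equals(target)) {
                    throw new TerminateException();
                }
                seen.add(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName)
                    throws SAXException {
                // Stop at the end of the targeted fragment (the default case).
                if (!terminateBefore && qName.equals(target)) {
                    throw new TerminateException();
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (TerminateException expected) {
            // Early termination was requested; the rest of the stream is not processed.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Terminating at the end of the fragment still lets the fragment's children be processed, whereas terminating at the start skips the fragment entirely.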

=== Bean Context

The _Bean Context_ is a container for objects which can be accessed during a Smooks execution. One bean context is created per execution context, that is, per `+Smooks#filterSource(...)+` operation. Provide an `+org.smooks.io.payload.JavaResult+` object to `+Smooks#filterSource(...)+` if you want the contents of the bean context to be returned at the end of the filtering process:

[source,java]
----
//Get the data to filter
StreamSource source = new StreamSource(getClass().getResourceAsStream("data.xml"));

//Create a Smooks instance (cacheable)
Smooks smooks = new Smooks("smooks-config.xml");

//Create the JavaResult, which will contain the filter result after filtering
JavaResult result = new JavaResult();

//Filter the data from the source, putting the result into the JavaResult
smooks.filterSource(source, result);

//Getting the Order bean which was created by the JavaBean cartridge
Order order = (Order)result.getBean("order");
----

Resources such as visitors access beans at runtime through the `+BeanContext+`, which is retrieved from `+ExecutionContext#getBeanContext()+`. When adding or retrieving objects from the `+BeanContext+`, you should first obtain a `+BeanId+` from the `+BeanIdStore+`. A `+BeanId+` is a special key that performs better than a `+String+` key, although `+String+` keys are also supported. The `+BeanIdStore+` must be retrieved from `+ApplicationContext#getBeanIdStore()+`. A `+BeanId+` is created by calling `+BeanIdStore#register(String)+`; if you know that the `+BeanId+` is already registered, you can retrieve it by calling `+BeanIdStore#getBeanId(String)+`. A `+BeanId+` is scoped to the application context. You normally register it in the `+@PostConstruct+`-annotated method of your visitor implementation and then reference it as a member variable from the `+visitBefore+` and `+visitAfter+` methods.

NOTE: `+BeanId+` and `+BeanIdStore+` are thread-safe.
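
To see why a pre-registered key can be cheaper than a `+String+` key, consider the following sketch. It is not the Smooks implementation: `+SketchBeanId+`, `+SketchBeanIdStore+` and `+SketchBeanContext+` are hypothetical classes that only mirror the register-once, look-up-often pattern described above, under the assumption that an ID can carry a stable index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; these are not the Smooks classes.
final class SketchBeanId {
    final String name;
    final int index;

    SketchBeanId(String name, int index) {
        this.name = name;
        this.index = index;
    }
}

// Application-scoped: maps each bean name to a stable index exactly once.
final class SketchBeanIdStore {
    private final Map<String, SketchBeanId> ids = new HashMap<>();

    // Registers the name if needed and returns its ID (same instance on repeat calls).
    synchronized SketchBeanId register(String name) {
        SketchBeanId id = ids.get(name);
        if (id == null) {
            id = new SketchBeanId(name, ids.size());
            ids.put(name, id);
        }
        return id;
    }

    // Returns the ID of an already registered name, or null if it is unknown.
    synchronized SketchBeanId getBeanId(String name) {
        return ids.get(name);
    }
}

// Execution-scoped: one instance per filtering run, indexed by SketchBeanId.
final class SketchBeanContext {
    private final List<Object> beans = new ArrayList<>();

    void addBean(SketchBeanId id, Object bean) {
        // Grow the list so that the ID's index is addressable.
        while (beans.size() <= id.index) {
            beans.add(null);
        }
        beans.set(id.index, bean);
    }

    Object getBean(SketchBeanId id) {
        // Positional lookup: no string hashing or comparison per access.
        return id.index < beans.size() ? beans.get(id.index) : null;
    }
}
```

In this sketch the string lookup happens once, at registration time, and every later access during filtering is a positional read. That is the motivation for registering the ID up front and keeping it in a member variable.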

==== Pre-installed Beans

A number of pre-installed beans are available in the bean context at runtime:

* https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/engine/bean/context/preinstalled/UniqueID.html[`+PUUID+`]: This `+UniqueID+` instance provides unique identifiers for the filtering `+ExecutionContext+`.

* https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/engine/bean/context/preinstalled/Time.html[`+PTIME+`]: This `+Time+` instance provides time-based data for the filtering `+ExecutionContext+`.

The following are examples of how each of these would be used in a FreeMarker template.

.Unique ID of the ExecutionContext:
....
${PUUID.execContext}
....

.Random Unique ID:
....
${PUUID.random}
....

.Filtering start time in milliseconds:
....
${PTIME.startMillis}
....

.Filtering start time in nanoseconds:
....
${PTIME.startNanos}
....

.Filtering start date:
....
${PTIME.startDate}
....

.Current time in milliseconds:
....
${PTIME.nowMillis}
....

.Current time in nanoseconds:
....
${PTIME.nowNanos}
....

.Current date:
....
${PTIME.nowDate}
....

=== Global Configurations

Global configuration settings are, as the name implies, configuration options that can be set once and be applied to all resources in a configuration.

Smooks supports two types of globals, default properties and global parameters:

* Global Configuration Parameters: Every `+<resource-config>+` in a Smooks configuration can specify `+<param>+` elements for configuration parameters. These parameter values are available at runtime through the https://www.smooks.org/v2/javadoc/v2.0.0-M3/smooks/org/smooks/api/resource/config/ResourceConfig.html[`+ResourceConfig+`], or are reflectively injected through the `+@Inject+` annotation. Global configuration parameters, by contrast, are defined centrally (see below) and are accessible to all runtime components via the `+ExecutionContext+` (rather than the `+ResourceConfig+`). More on this in the following sections.

* Default Properties: Specify default values for `+<resource-config>+` attributes. These defaults are automatically applied to `+ResourceConfig+`s when their corresponding `+<resource-config>+` does not specify the attribute. More on this in the following sections.

==== Global Configuration Parameters

Global configuration parameters differ from default properties in that they are not specified on the root element and are not automatically applied to resources.

Global parameters are specified in a `+<params>+` element:

[source,xml]
----
<params>
    <param name="xyz.param1">param1-val</param>
</params>
----

Global Configuration Parameters are accessible via the `+ExecutionContext+` e.g.:

[source,java]
----
public void visitAfter(Element element, ExecutionContext executionContext) {
    String param1 = executionContext.getConfigParameter("xyz.param1", "defaultValueABC");
    ....
}
----

==== Default Properties

Default properties are properties that can be set on the root element of a Smooks configuration and are then applied to all resource configurations in that configuration file. For example, if all the resource configurations in a file share the same target profile, you could specify `+default-target-profile=order+` on the root element to save specifying the profile on every resource configuration:

[source,xml]
----
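<?xml version="1.0"?>
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
                      default-target-profile="order">

    <resource-config>
        <resource>com.acme.VisitorA</resource>
        ...
    </resource-config>

    <resource-config>
        <resource>com.acme.VisitorB</resource>
        ...
    </resource-config>

</smooks-resource-list>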
----

The following default configuration options are available:

* `+default-target-profile+`: Default target profile that will be applied to all resources in the Smooks configuration file where a target profile is not defined.
* `+default-condition-ref+`: Refers to a global condition by the condition's ID. This condition is applied to resources that define an empty "condition" element (i.e. `+<condition/>+`) that does not reference a globally defined condition.

=== Configuration Modularization

Smooks configurations are easily modularized through use of the `+<import>+` element. This allows you to split Smooks configurations into multiple reusable configuration files and then compose the top-level configurations using the `+<import>+` element, e.g.:

[source,xml]
----


    
    


----

You can also inject replacement tokens into the imported configuration by using `+<param>+` sub-elements on the `+<import>+`. This allows you to make tweaks to the imported configuration.

[source,xml]
----



    
        order
    


----

[source,xml]
----



    
        .....
    


----

Note how the replacement token injection points are specified using `+@tokenname@+`.
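
Conceptually, token injection is a textual substitution on the imported configuration before it is processed. The following is a minimal sketch of that idea, not Smooks's own code: `+TokenReplacementSketch+`, `+inject+` and the token name `+orderElement+` are hypothetical, and leaving unknown tokens untouched is a choice made for this sketch rather than documented Smooks behavior.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of @tokenname@ substitution; not Smooks code.
class TokenReplacementSketch {
    // A token is a name wrapped in '@' characters, e.g. @orderElement@.
    private static final Pattern TOKEN = Pattern.compile("@([A-Za-z0-9_.-]+)@");

    // Replaces each @name@ with its value; unknown tokens are left untouched.
    static String inject(String config, Map<String, String> params) {
        Matcher matcher = TOKEN.matcher(config);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = params.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

For instance, calling `+inject+` with a parameter map that binds `+orderElement+` to `+order+` turns `+@orderElement@/items+` into `+order/items+`.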
// end::fundamentals[]

// tag::exporting-results[]
== Exporting Results

When using Smooks standalone, you are in full control of the type of output that Smooks produces, since you specify it by passing a particular Result to the filter method. When integrating Smooks with other frameworks (JBossESB, Mule, Camel, and others), however, the output type needs to be specified inside the framework's configuration. Since version 1.4 of Smooks, you can declare the data types that Smooks produces and use the Smooks API to retrieve the Result(s) that Smooks exports.

To declare the type of result that Smooks produces, use the `+exports+` element as shown below:

[source,xml]
----

   
      
   

----

The `+exports+` element declares the results that are produced by this Smooks configuration. An `+exports+` element can contain one or more `+result+` elements. A framework that uses Smooks could then perform filtering like this:

[source,java]
----
// Get the Exported types that were configured.
Exports exports = Exports.getExports(smooks.getApplicationContext());
if (exports.hasExports())
{
    // Create instances of the Result types.
    // (Only the types, i.e. the Class types, are declared in the 'type' attribute.)
    Result[] results = exports.createResults();
    smooks.filterSource(executionContext, getSource(exchange), results);
    // The Result(s) will now be populated by the Smooks filtering process and
    // are available to the framework in question.
}
----

There might also be cases where you only want a portion of the result extracted and returned. You can use the `+extract+` attribute to specify this:

[source,xml]
----

   
      
   

----

The `+extract+` attribute is intended for cases where you are only interested in a sub-section of a produced result. In the example above, only the object named `+orderBean+` is exported; the other contents of the `+JavaResult+` are ignored. Another use for this kind of extraction is when you only want a `+ValidationResult+` of a certain type, for example to return only validation errors.

Below is an example of using the `+extract+` option from an embedded framework:

[source,java]
----
// Get the Exported types that were configured.
Exports exports = Exports.getExports(smooks.getApplicationContext());
if (exports.hasExports())
{
    // Create instances of the Result types.
    // (Only the types, i.e. the Class types, are declared in the 'type' attribute.)
    Result[] results = exports.createResults();
    smooks.filterSource(executionContext, getSource(exchange), results);
    List