Frequently-Asked Questions

Click on the tabs below to toggle between the FAQs for GemFire Enterprise and GemFire Real-Time Events.

  1. What is GemFire Enterprise (GFE)?

  2. Architecture

  3. Integration with data sources

  4. Concurrency and Data Consistency

  5. Data Distribution

  6. System Configuration

  7. System Monitoring and Management

  8. Programming Model

  9. Standards and Platform Support

What is GemFire Enterprise (GFE)?

GemFire Enterprise (GFE) is a high-performance, distributed data caching platform that serves as the core component of the GemFire Enterprise Data Fabric . It supports multiple topologies that allow it to be easily embedded into application architectures such as J2EE, .NET, Grid, SOA and web portals and provide extremely high performance, throughput, and scalability. Learn more about GemFire Enterprise's features here

Architecture

Describe GFE's architecture.
GFE is an advanced operational data infrastructure that combines the capabilities of distributed caching, databases and middleware messaging.
A basic description of GFE architecture and topology can be found here. A technical white paper that describes the detailed architecture and functionality of GemFire can be downloaded here.

If you have any further questions, not addressed in this section, please send an email to developer@gemstone.com.

Can data be partitioned across nodes? If so, how is this configured? Are requests for data handled and routed automatically to the correct partition?
GFE supports partitioned data regions (a single logical region spread over multiple physical nodes) as well as non-partitioned regions within a distributed system. User-defined policies/configurations control the memory management and redundancy (for high availability) behavior of partitioned regions. By configuring the required level of redundancy, node failures can be automatically handled, as client get/put requests will automatically be redirected to a backup node. When a node holding a data partition fails, the data regions held in it are automatically mirrored to another node to ensure that the desired redundancy levels are maintained.

GemFire also provides disjoint distribution across cache nodes. i.e. each member cache manages a (either statically or dynamically configured) subset of the entire data set. Member caches pull data from other nodes on demand leaving only the data most frequently used by the application in the local cache node. This pull on-demand model is another way to partition the data.

When cache servers are in use, clients can connect to multiple instances of cache servers with data regions partitioned across these servers. The data ownership (which data regions resides where) is specified through the cache XML descriptor to each server cache. Clients connect to cache servers using a per data region XML descriptor, permitting the client to load balance requests across multiple data cache servers. The server side data regions can be replicated to other cache servers for high availability.

Can distributed caches be partitioned according to keys? Can a new partitioning algorithm be plugged in? Do partitioned keys support warm startup?
If data is statically partitioned, where each node is responsible for loading its own data, the algorithm is completely controlled by the user.
The dynamic cache partitioning (Partitioned Regions) implementation uses a hashing algorithm to assign the key to a bucket. And, buckets are automatically mapped to cache nodes. Applications have the option to specify a LRU cache on any cache node so frequently accessed data items are made available locally. Partitioned Caching does provide built-in transparent redundancy/failover.

Does GemFire support 64 bit addressing?
Yes. The product has been tested and certified for 64-bit solaris.
The product has also been tested with very large VM heaps in the Linux 64-bit environments.

Can the maximum size of the cache be configured? If so, does your solution provide paging of the cache when it's full?
Yes. Capacity controllers can be installed on any region to limit the amount of memory resource consumed by any data region. GemFire comes with multiple built-in capacity controllers. A HeapLRUCapacityController controls the contents of Region based on the percentage of Java VM heap memory that is currently being used. If the percentage of Java VM heap memory in use exceeds the given percentage, then the least recently used entry of the region is offloaded to disk. GemFire disk persistence (for disaster recovery or overflow) is highly optimized and extremely fast.

Do you provide LAN/MAN/WAN replication?
GemFire supports LAN/MAN/WAN environments through a variety of topologies and consistency policies. GemFire offers a novel model to address these topologies ranging from a single cluster all the way to replication across multiple data centers across a WAN. This model allows distributed systems to potentially scale-out in an unbounded and loosely coupled fashion, without loss of performance and data consistency. At the core of this architecture is a gateway hub/gateway model to connect and configure distributed systems/sites in a loosely coupled fashion. Each GemFire distributed system can assign a process as its gateway hub, which contains multiple gateways that connect to other distributed systems. Backup gateways and gateway hubs can be set up and configured to handle automatic fail-over.

The list of configuration options for data distribution and replication includes optimistic, pessimistic, transactional, store-and-forward guaranteed delivery, and automated management of "slow receiver" processes that prevents system-wide impact. If configured, GemFire guarantees event callback notifications in any failover/failback scenario in addition to assuring cache consistency. All these replication options are configured via the cache.xml file.

For replicating data across many cache instances in a LAN, GemFire offers the following options:

  • "Replication on demand": Data object is replicated to where it's used. (A 'PULL' model). In this model, the object resides only in the member cache that originally created it. Objects arrive to other cache members only when the connected applications request the object. The object is lazily pulled from other member caches. Once the object arrives, it will automatically receive updates to the object as long as the member cache retains interest in the object.
  • "Key replication": Only the keys of the objects cached are replicated - A 'PUSH' model. (This model can preserve network bandwidth and be used for low bandwidth networks)
  • "Total replication": All data is replicated (A 'PUSH' model)

Can the number of replicas be configured?
Yes. The selective mirroring model in GemFire also permits applications to dynamically add nodes with mirrored cache regions to increase the availability of the cached data. For instance, a application configured with just one mirror could dynamically be expanded to increase the number of mirrors by simply adding a cache member with mirroring turned ON. GemFire also supports dynamic cache partitioning, allowing you to "stripe" data across multiple processes and dynamically add capacity as well as redundancy.

Does GemFire fail-over automatically to a backup node? Does it provide fail-back to the primary node?
Yes and Yes.
There are two key logical components to deal with: failure detection and actual failover. Failover detection is managed through GemFire's Group Membership Services, which monitors all connections through various mechanisms including hearbeats. Configuration options let you tune failure-detection sensitivity by choosing timeout, retry interval, and retry count parameters. Once a failure is determined, failover is nearly instantaneous as GemFire maintains open connections to hot-backup servers at all times.
Failover/Failback is generally transparent to application logic.

Can GemFire be dynamically configured with changes in the cluster membership? Is this re-configuration automatic?
Yes, membership discovery is either handled through an IP multicast channel or through the use of a TCP/IP locator service (when MCast is not an option). New member information is automatically relayed to all members. The distribution system uses a heartbeat mechanism to monitor the health of the entire system. Sudden member departure will be notified to all the members, once the death of departure has been confirmed. Locks, connections, and other resources owned by the departed member are automatically released.

Does your solution need to be backed by a database?
No. GemFire has a built-in high-performance persistence mechanism for data regions. Any meta data required during a recovery from failure scenario is automatically obtained either from the cache XML descriptor or from other connected members. For database integration, we provide write-through and guaranteed write-behind API callback notifications where you can place O/R mapping and other database interation code.
(The native C++ version of GemFire Enterprise also provides integrated support for the use of Berkeley DB as the persistence mechanism.)

Integration with data sources

How is data loaded into GFE from remote data sources?
GemFire provides a framework for loading a cache from a remote data source. GemStone also provides packaged examples that illustrate how the cache can be populated or synchronized with a relational database using Hibernate, a OR mapping tool.

Do you provide write-through to a database? If so, is this synchronous or asynchronous?
GemFire provides a callback interface that can be programmed for write-through to a database. The write-through callback is always invoked guaranteeing consistency between a backend database and the cache. This may be synchronous or asynchronous (lazy, write-behind) with GemFire providing an HA backing persistence of the write-behind queues.

Please describe the write-through architecture. Does it rely on an O/R container (such as JDO, or EJB)?
The synchronous write-thru architecture in GemFire captures and applies the cache update operation to the database first, and IFF it succeeds applies it to the cache. The cache entry is locked while the DB synchronization occurs. Applications configure a "writer", a callback that takes a cache event and applies it to the underlying data store. A "writer" can be configured on any node in the system. A cache update operation triggered on a different node automatically triggers the remote writer first. This feature is referred to as a "netWrite".
Writers can use entity beans or straight JDBC to write to the database and automatically participate in any ongoing transaction.
GemFire provides example Cache loaders, writers and listeners that use the Hibernate OR mapping tool to synchronize with a relational database.
When using our write-behind capability, your system can continue to run even during a lengthy database outage. Updates written to the cache with write-behind activated are queued (with HA) and then written to the database on a low-priority thread.

Can GemFire provide write through to more than one database?
Cache writers can be configured per data region. Hence, Yes, multiple databases can be connected to a single cache instance.

Concurrency and Data Consistency

Does GemFire provide pessimistic and/or optimistic concurrency?
Both. Pessimistic concurrency control is provided through a distributed lock service built into the product. Transactional updates through the GemFire Transaction Manager are ACID, two-phase. optimistic, and READ_COMMITED.

Does GemFire support one-phase and two-phase (XA) transactions?
Yes. GemFire has a built-in JTA-compliant transactions manager. It may enlist other registered JTA resources in a transaction, be enlisted by container-managed transactions, or be used for GemFire-only transactions.
It should be noted that the 2-phase operation is used internally by GemFire for synchronizing changes across many cache instances participating in the transaction and involving a single external resource such as a database.

Do you provide time-to-live and eviction policy functionality?
Yes. Applications can configure TTL on memory regions and set eviction policy such as global destruction of the cache entries, local invalidation or simply spill over to disk. Besides TTL, applications can also configure eviction based on 'idleTime'- the amount of time the object may remain in the cache after last access. Subsequent access to entries that have timed out will automatically result in the entry either being recovered from disk, a different cache instance or loaded from the data source.

Do you provide a read-write facility governed by a distributed lock manager?
Yes. Data regions configured as "GLOBAL" automatically engage the distributed lock manager for all read/write operations. The locks obtained during a read can be prolonged through the use of the distributed lock service API.

Does you solution provide an API or other means by which a distributed lock manager can be utilized?
Yes. Applications can explicitly manage locks using the GemFire distributed lock service. Typically used, when the multiple locks over multiple GemFire regions are required by the pessimistic application use case.

Data Distribution

What network/transport protocols does it support/rely upon?
TCP/IP, Reliable Multicast, and Unicast.
Multicast may be used for member discovery and/or for data distribution. In all scenarios, GemFire also supports TCP/IP-only configurations.

Does GemFire provide synchronous and asynchronous replication between nodes?
GemFire provides a full menu of configuration options for data replication and distribution:
Synchronous communication without application acknowledgement
Applications that do not have very strict consistency requirements and have very low latency requirements should use synchronous communication model without acknowledgements to synchronize data across cache nodes. This is the default distribution model and provides the highest response time and throughput. Though the communication is "out-of-band", the sender cache instance makes every attempt to dispatch messages as soon as possible diminishing the probability of data
conflicts.

Synchronous communication with application acknowledgement
Regions can also be configured to do synchronous messaging with other cache members. With this configuration, the control returns back to the application only after the receiving caches have all acknowledged receipt of the message. This pessimistic mode should be used with prudence especially with increased number of cache members accessing the same region.

Synchronous communication with distributed global locking
Finally, for pessimistic application scenarios, global locks can first be obtained before sending updates to other cache members. A distributed lock service manages acquiring, releasing and timing out locks. Any region can be configured to use global locks behind the scenes through simple configuration. Applications can also explicitly request locks on cached objects if they want to prevent dirty reads on objects replicated across many nodes.

Automatic switching to asynchronous mode for slow consumers
A distribution timeout can be associated with each node, so that if a message publisher (cache)does not receive message acknowledgements within the timeout period from a consuming node, it can switch from the default synchronous communication mode to an asynchronous mode for that consumer. This kind of a switch is primarily used only for publsihing regions that do not require application acknowledgements from the consuming side. When the asynchronous communication mode is used, a producer batches messages to be sent to a consumer via a queue, the size of which is controlled either via queue timeout policy or a queue max size parameter. Events being sent to this queue can also be conflated if the receiver is interested only in the most recent value of a data entity. Once the queue is empty, the producer switches back to the synchronous distribution mode, so that message latencies are removed and cache consistency is ensured at all times.

Can the synchrony parameter (I.E. control of consistency vs. performance) of the WAN, MAN and LAN replication be configured independently?
All distribution models discussed above as well as store-and-forward queues are configured on each cache region and scoped by a cache instance. In other words, different cache instances can specify different distribution scope for the same region. So, it is possible for a remote site to configure the region with store-and-forward queue distribution, whereas the local members can use a pessimistic (synchronous) replication strategy.

How is distribution reliability controlled in GFE?
At a transport level, the usage of TCP/IP or Reliable UDP Multicast guarantees message delivery.
In addition, GFE provides a novel, declarative (user-defined) approach for managing data distribution with the required levels of reliability and consistency across several hundreds or even thousands of nodes. Application architects can define 'roles' relating to specific functions and identify certain roles as 'required roles' for a given operation/application. For instance, 'DB Writer' can be defined as a role that describes a member in the GemFire distributed system that writes cache updates to a database. The 'DB Writer' role can now be associated as a 'required role' for another application (Data feeder), whose function is to receive data streams (for e.g., price quotes) from multiple sources and pass on to a database writer. Once the system is configured in such a fashion, the data feeder will check to see if at least one of the applications with role 'DB Writer' is online and functional before it propagates any data updates. If for some reason, none of the 'DB Writers' are available, the price feeder application can be configured to respond in one of the following ways - a.) block any cache operations, b.) allow certain specific cache operations, c.) allow all cache operations or disconnect and reconnect for a specified number of times to check if the required roles are back online.

Describe how certain volatile scenarios (such as split-brain syndrome) are handled
GemFire provides a membership API that can be used to detect a split-brain scenario. The solution to the problem requires 3 members at a minimum. When a member detects ungraceful exit of another member, it announces this to other members. Similarly, the other members do the same. If more than one member indicates the unexpected exit of the same member, the member is presumed unreachable. On the other hand, the member who lost its network link, will notice that all the other members have ungracefully left the system and can take a different course of action.

GFE can also automatically handle and recover from issues such as network segmentations, which cause the GemFire distributed system to become disjointed into two or more partitions. In such a scenario, each member in a disjointed partition evaluates the availability of all the required roles (discussed in the above question) in that partition, and if all such roles are available, then that partition automatically reconfigures itself and sustains itself as an independent GemFire distributed system. On the other hand, if all the required roles are not found in a partition, then the primary member in that partition can be configured to disconnect and reconnect to the GemFire distributed system a specified number of times. If this reconnect protocol fails, the member shuts down after logging an appropriate error message. With this approach, a network segmentation/partitioning is handled by the distributed without any human intervention. As a result, complex and expensive network 'merge' operations are not required once the segmentation issues have been fixed, and data inconsistency is avoided. This level of built-in intelligence and self-healing capabilities greatly enhances the productivity and efficiency of network engineers and application architects, who often spend several hours monitoring, identifying and debugging such issues in complex IT environments.

Configuration

Describe the configuration process and capabilities. What components can be configured?
GemFire can be configured programmatically using either the Java admin API or with the JMX API.
GemFire caches can be configured through three configuration files; GemFire.properties, sysconfig.xml, and cache.xml. The GemFire.properties file contains the settings required to join a distributed system. The sysconfig.xml file governs configurations of shared memory that is used to facilitate 'C' cache access. This is optional and only used when data is being managed in a shared memory segment. The cache.xml file contains region and entry configurations that are used to initialize a cache at creation time.
The cache.xml is the central place for describing all the data regions, their distribution and consistency characteristics, eviction model, external data loading and synchronization callbacks, etc. Data that needs to be preloaded into the cache during startup can also be specified in this descriptor.
GemFire also includes management tools for distributed system configuration. The GemFire Console and the GemFire command-line utility allow you to view and modify configuration attributes for distributed systems and individual system members.
The components that can be configured include: shared memory segment, embedded cache, cache servers, high availability and failover of cache servers, distribution system - IP Multicast or TCP locator setup, etc.

Is configuration web-based or via some other method (e.g. XML text files)? Is this centralized?
An optional JMX agent can be used to configure and manage a distributed system. GemFire bundles a HTTP adapter that is JMX enabled allowing some configurability from a web browser. The primary means of managing the cache is done through a centralized Swing based console. The GemFire console can be run standalone and can be used to manage multiple independent distributed systems.

If the interface is web-based, what is it built upon (e.g. JSF, Struts)?
The JMX agent uses Jetty as its servlet engine. GemFire console is built using Java Swing technology.

Can the web-based configuration console be portalized (JSR168)?
The GemFire management functions are exposed as JMX Mbeans making it possible to create a Portlet façade on top quite easily.
The servlets in the GemFire management agent could also support a JSP tag library.

Can configuration changes be applied dynamically to a running system, or does it need to be restarted?
Yes, with restrictions. For instance, you can add a new mirror cache to the system to increase the availability of the system and data, add capacity to a partitioned cache, change the eviction characteristics, etc. But, certain operations like dynamically reconfiguring a local cache to be distributed is not permitted. The possible configuration changes are designed such that there are no compromises on data consistency or introduce the possibility of race conditions.

If a configuration API is provided, is it JMX-based?
There is a proprietary Java admin API and a JMX API. The Java API offers significant simplicity in comparison to JMX.

How is configuration information stored? In an XML file or another database repository?
In XML files.

Does your solution allow for registering call-backs on occurrence of changes in the configuration of the distributed cache, including topology and cluster membership changes?
Yes. GemFire admin API permits applications to register a listener for membership level callbacks. The callback provides member arrival or departure information. This includes the identifier for the member cache along with severity (when the cache exits)

Does your system provide alerts? If so, under what conditions are alerts triggered? How are these alerts delivered?
Alerts can be configured through the GemFire console or using the GemFire admin API. Alert severity level can be configured. For instance, if the distribution system notices that the ACKs received take longer than the configured time limit, alerts can be raised (in the system log and the console).Alerts get triggered under other exception conditions too.
Monitoring health in a distributed system that is primarily main-memory based is becomes very important as the cache and the application are going after the same set of resources. Application administrators might want to take proactive automated action as the health of the system deteriorates to maintain the availability and QoS of the application.

How easy is it to customize the configuration console?
Most of the configuration options have defaults. Developers can start using the distributed with almost no explicit configuration and slowly tune the system.

System Monitoring and Management

Do you have an API for monitoring a running system?
The health monitoring API allows you to configure and monitor system health indicators for GemFire Enterprise Distributed Systems and their components. There are three levels of health: good health that indicates that all GemFire components are behaving reasonably, okay health that indicates that one or more GemFire components is slightly unhealthy and may need some attention, and poor health that indicates that a GemFire component is unhealthy and needs immediate attention.
Because each GemFire application has its own definition of what it means to be "healthy", the metrics that are used to determine health are configurable. GemFireHealthConfig provides methods for configuring the health of the distributed system, system managers, and members that host Cache instances. Health can be configured on both a global and per-machine basis. GemFireHealthConfig also allows you to configure how often GemFire's health is evaluated.
The health administration APIs allow you to configure performance thresholds for each component type in the distributed system (including the distributed system itself). These threshold settings are compared to system statistics to obtain a report on each component's health. A component is considered to be in good health if all of the user-specified criteria for that component are satisfied. The other possible health settings, okay and poor, are assigned to a component as fewer of the health criteria are met.

What performance runtime statistics can be gathered and monitored? What do these statistics determine?
GemFire incorporates a sophisticated statistics management system. The statistics sub-system collects cache usage, performance, distribution, locking, memory footprint, CPU utilization, etc at runtime. The collected statistics are periodically written to a statistics archive database. GemFire offers real-time statistics aggregation and correlation capabilities in the Console as well as a graphical tool called Visual Statistics Display (VSD). GemFire exposes the statistics API to the application making it possible to correlate application level events with events in the cache. The runtime statistics serve multiple purposes: from debugging hard to find performance related issues, leaks to fine tuning the deployment in production.

How is the information presented? Is this centralized?
Graphically in the console as either charts or numerical values in cells. VSD provides all stats in charts.
VSD is capable of reading multiple statistic archives (from many members of a distributed system) to analyze the behavior across the entire distributed system.

How easy is it to create customized reports? Describe the process of creating one.
Charts can be custom built in VSD. i.e. any number of related stats can be combined and viewed in a single chart. For instance the administrator might want to analyze the impact of CPU utilization on the cache access performance. These custom charts can be saved as templates for later reuse.

Do you provide a JMX-based monitoring API?
All GemFire managed stats are accessible through the JMX interface.

Programming Model

What is the programming model of GemFire Enteprise? What language is it written in?
Java, C+, and C# are the primary programming models for this product. The underlying product functionality has been developed in Java (GemFire Enterprise) and C++ (GemFire Enteprise - C++) respectively. Support for C#/.NET and C# access to though a JNI bridge.

What APIs can be used to access the cache? Provide both standard and vendor-specific examples.
The product exposes multiple access APIs: JCache, JTA, JMX, ODMG OQL, JMX, C, C++ and C#.

What languages does it support? Java, C/C++?
Native support for Java and C/C++. JNI-based API's for C#.

Does GemFire allow for use as a distributed work scheduler?
The combination of GemFire statistics and the underlying data distribution and notification services can be used to easily build a rudimentary Job/Task scheduler. GemFire stats provide continuous system metrics such as CPU utilization, available memory, etc across the distributed system which can be capitalized by a scheduler component to do load based routing of tasks.
GemStone is partnering with compute grid vendors and augmenting their solution with a data gridsolution. Learn more about how GFE can utilized in grid environments.

Standards and Platform Support

Is your solution JSR 107 (JCache) compliant?
Yes, the implementation has been based on the publicly visible initial JSR document.

What OS platform(s) your does your software support?

  • Any Unix flavor
  • Windows 2000, XP,
  • SuSe Linux Enterprise 8,
  • SuSE Linux 9,
  • RedHat ES 3, AS3
  • ZLinux & ZOS mainframe

What language is your software written in?
GemFire Enterprise offers both native Java and native C++ implementations (with interoperability and some variations in features). GemFire also provides API's in C# and has an optional native 'C' layer for applications wanting to manage data for multiple application on a single machine through shared memory segments.

  1. What is GemFire Real-time Events (RTE)?

  2. Sample usage patterns

  3. Architecture

  4. Event modeling and language support

  5. Performance and Scalability

  6. High Availability and Fail-over

  7. Integration with Other Technologies

  8. Security

  9. Platforms and Standards Support

What is GemFire Real-time Events (RTE)?

GemFire RTE is an in-memory event processing and analytics product based on continuous querying (CQ). From an architectural standpoint, it offers a relational storage engine that can handle fast changing data inputs (thousands of events/second) and also store static data originating from other enterprise databases and data warehouses.
You can download the GemFire Real-Time Events Technical White-paper and an evaluation copy od the software here.
This sample application (link TBD) provide further details on the use of GemFire RTE.

Sample usage patterns

  1. Real-time Desktop/Web applications - C# and Java client or web applications can connect to RTE and register continuous queries (SQL) to deliver real-time updates to users in portals designed for purposes such as online trading, eCommerce, travel reservations, etc.
  1. Event-driven SOA and Web Services - RTE can be used to track state changes or specific events and the callback mechanism can invoke relevant web services to initiate event-driven workflows.
  1. Pattern detection - Retail point of sale data, credit card transactional data, or any such real-time information streams can be fed into RTE for detection of specific patterns (represented as CQs). As and when a data entry or event is fed into RTE, pattern matches are identified, notifications sent to the impacted clients so that necessary action can be taken via the callback interface.

Architecture

Describe the RTE architecture
The GemFire™ Real-Time Events™ architecture merges several enterprise middleware components into a single system for convenience and technology synergy:

  • Continuous Query (CQ) Engine—data stream filtering & efficient SQL-based client/server interaction with zero polling.
  • Operational Datastore—data integration, dynamic data revision, and ad-hoc historical querying
  • Messaging Bus—event notification to remote clients

From a process viewpoint, RTE consists of one or more server VMs and a number of client VMs working to configure, populate, and query event data. Clients can run standard JDBC queries, and can also register continuous queries (CQs) with the servers. These CQs receive initial result sets when they are registered, and can be configured with listeners to receive result set updates at regular intervals. The events supplied to the listeners include row delta information that is supplied to listener callbacks and is used to keep the result set up-to-date. The client can also explicitly request a result set at any time for any CQ that it has registered with a server.

Describe how might RTE be used to support a distributed workflow in which each node in the workflow graph represents a small amount of work and state
There are two scenarios for distributed workflow. One involves RTE stand-alone where processing of workflow activities can be on remote nodes or embedded (server-side) functions. The second distributed workflow scenario involves seamless integration with our GemFire Enterprise product to provide shared workflow state across nodes. In this scenario, workflow processing updates can be sent to a colocated GemFire Enterprise cache (versus updates being sent to the RTE primary server). Here are high level illustrations of the two scenarios:

How is event routing handled across nodes within your system? When is event routing explicit and when is it implicit?

Event routing is explicit between server and remote clients. RTE provides built-in messaging between client and server nodes so that client applications may be developed completely in an event driven manner. When a new event that includes changes to a Continuous Query ResultSet is detected on the server, the client is explicitly notified via a callback API known as a CQListener interface. The CQListener interface provides two methods that apprise the client of notification events. The beforeResultSetUpdated() callback informs the client that an event is being routed to the client. Once the client application returns from the beforeResultSetUpdated() callback, the client side RTE logic will then apply the ResultSet updates and then invoke the second explicit callback API: afterResultSetUpdated(). The afterResultSetUpdated() callback informs the client application that new events have been captured on the server and applied on the client's node. The parameter passed to the afterResultSetUpdated() method is a CQUpdate object. From a CQUpdate, the client application can extract precise information on what rows in the ResultSet have changed, the way a row has changed (insert, update, delete), and what the new field values are.

Event routing is implicit between server processes, viz., between the primary RTE server and secondaries. RTE utilizes a replicated state machine approach for high-availability. Therefore, any DML and DDL events are implicitly routed from the primary to any number of secondary nodes.

Performance and Scalability

Please provide performance numbers for your solution. Please include the hardware configuration used to attain this performance.
The RTE performance benchmarks are performed on a single server process running on a dual Xeon 3.6Ghz blade. Clients are represented by 10 client VMs on a separate equivalent machine. The answers below provide a comprehensive view into the performance and scalability characteristics.

What is the largest number of event streams currently supported by RTE in a production environment?
The number of streams supported by RTE is not bounded. A 'stream' can be represented either as a table or as a client input thread. RTE has demonstrated hundreds of input clients and hundreds of tables in the system.

For a simple join on two streams, how many events per second does you product support?
Join queries are handled as incrementally maintained materialized views in RTE. With a simple join, 10,000 events/sec can be achieved

What is the largest number of simultaneous queries currently supported by RTE in a production environment?
RTE has demonstrated support for thousands of queries per server node with tens of thousands of input events per second. Servers can handle more queries if there is less input, and scalability can be achieved by adding more servers.

How can event streams be partitioned among many machines to provide scalability? If so, how many machines has RTE supported in a production environment?
If there are distinct sets of streams, they could be run on completely separate RTE distributed systems. However, this is often not practical, as the streams are interrelated and clients will need access to all streams. Rather than the number of input streams controlling the throughput in RTE, it is really the number and complexity of the continuous queries that cause workload. Queries can be load balanced across multiple servers, achieving horizontal scalability. RTE supports any number of servers, but has been demonstrated with 10 servers running in one distributed system.

If streams are partitioned, what constraints are placed on the kinds of event conditions that can be evaluated on the partitioned stream?
No constraints.

What strategies do you provide for handling cases where the incoming event rate exceeds what can be processed?
RTE has asynchronous event processing with internal queues that are designed to handle event rate spikes.

Does RTE support parallel threaded execution within a single operating system process? If so, how many CPUs have been supported in a production environment?
RTE is a highly parallelized product that can make use of any number of CPUs. Testing has been performed with 16 cpu machines.

Does RTE provide deterministically bounded latency?
RTE provides for configuration of a per-query response time agreement. The typical response time agreement is 200ms, although we have successfully reduced this to 5ms in customer projects. While events may be delivered faster or slower than this, there are numerous controls available to ensure that events will be delivered in batches at this interval.

High Availability and Fail-over

Does RTE provide failover and recovery?
GemFire RTE provides sophisticated replication, failover and recovery behavior. In order to provide failover, RTE supports running any number of server processes that are kept in synchronization with an asynchronous fault-tolerant replication mechanism. Client processes connected to this cloud of servers need only know the address of one server to connect to the system, and any redirection, failover or recovery is transparent to the end user.

Does RTE provide hot failover? If so, how long does it take for a node to failover?
GemFire RTE provides hot failover through a transparent failover mechanism, in which client connections and continuous queries are re-routed to the new primary server. There is minimal disruption of service and no noticeable impact to the client.

If failover is provided, can the failover configuration be changed dynamically?
The failover configuration is dynamically and automatically managed in the distributed RTE system. Server processes continually share membership information and support a dynamic join/leave mechanism. New servers to the system can obtain an initial image of the data in the system automatically by simply joining the system. Clients are made aware of new servers to the system and removal of servers from the system. The client processes can also dynamically load balance to the newly added servers.

Under what scenarios would events be lost during a failover and recovery circumstances?
GemFire RTE has gone to great lengths to prevent event loss, and as such offers a number of techniques and configuration options regarding event loss during failover. Due to the fact that the replication mechanism is asynchronous in RTE, it is possible that a client may submit an input event that does not make it to all replicas before the primary server crashes. This 'event gap' problem that is inherent in systems that use asynchronous replication can be completely avoided or mitigated by an advanced failover feature RTE supports called 'Replay'. This is achieved with a client-side cache of input events, which can be used to fill in event gaps in the occurrence of a failover event. This is transparent to the user and is orchestrated behind the scenes. It is still possible that event loss could occur, for example in the instance of a primary server crashing at the same time as a client crashes, or the client 'Replay' caches may not contain enough events to fill in the event gap. There is a configurable number of events to 'skip' in this instance. It is also possible to configure the RTE to shutdown on any message loss.

Integration with Other Technologies

What level of integration does RTE provide with J2EE products? How easy is it to integrate with WebLogic Server?

GemFire RTE is a pure Java solution that is packaged as a single jar file and is highly embeddable in any J2EE container.

How does RTE integrate with event streams? Does it natively support financial market event streams? How are these streams handled?

GemFire RTE is designed as horizontal infrastructure that enables programmers to model event streams as simple tuple streams. We currently have not focused attention on developing customized financial market event stream handlers but this wouldn't be difficult.

How does your system allow functions written in Java to be included in event conditions? Functions written in C/C++?

User-defined functions written in Java may be created and then invoked from predicates. C/C++ functions can be wrapped in Java Native Interface calls.

How does your system allow functions written in Java or C/C++ to be executed when an event condition matches?

Because GemFire RTE is written in Java, we exploit dynamic class loading and reflection to create a powerful extension to core product functionality, that is, the ability to create User-Defined Aggregates (UDA). UDAs enable developers to specify arbitrary aggregation logic in place of built-in aggregates (e.g., COUNT, MAX, MIN, SUM). UDAs can be used anywhere an aggregate function is allowed in the SQL syntax. UDAs are dynamically loaded Java classes implementing three interface methods:

  • open() performs initialization for the computation of the aggregate on the set of rows * participating in the aggregation.
  • next() iterates on the set of rows and incrementally builds values for the aggregate computation.
  • close() performs the final computation to arrive at the aggregate result, and also may perform clean up.

UDAs extend SQL enabling stream events to be analyzed with a state based reasoning logic. For example, moving average queries or even sequence based queries (where one event happens after a previous event) can be elegantly expressed and efficiently implemented. For example, given the tuplestream:

a user could compute the percentage of the top n trades for each stock symbol for the stock ticker data received in the last one hour as:

Does RTE integrate with any other products?

GemFire RTE is designed for easy integration with the GemFire Enterprise. The synergy of the GemFire RTE and the GemFire Enterprise produces what we call an 'enterprise data fabric' consisting of horizontally scalable analytical processing pipelines:

Within each analytical processing pipeline, stream data passes through the following phases:

  1. Capture—high speed stream events are buffered in the distributed cache
  2. Integration—disparate stream events are merged into an integrated data model by the operational datastore
  3. Filtering—continuous queries filter the integrated data model looking for patterns of interest
  4. Aggregation—data is aggregated by user-defined functions and analytical libraries
  5. Distributed—analytical data is distributed to remote stakeholders via the distributed cache

Adding parallel Java virtual machines easily scales up this data fabric.

Security

Describe the security architecture of your solution?
GemFire Real-Time Events provides a user authentication callback that is invoked on connection and continuous query registration. An application can create a class that implements the interface com.gemstone.gemfire.sql.SecurityListener and specifies that class at server configuration startup. For example, to start a Real-Time Events server process with the implementation listed below of SecurityListener (mysecuritypackage.MySecurityListener), do the following:

prompt> productDir/bin/gfrteserver -gfsql.securityListener=mysecuritypackage.MySecurityListener

Event modeling and language support

How are event types described in your system, whether graphically or with a textual language?
Event types can be easily modeled with SQL Data Definition Language commands.

Do you support any standard schema types for describing events (e.g. XML, SQL DDL variant, others)?
SQL DDL.

What operators and built-in functions does your system support for event conditions involving specific events?
The Continuous Querying syntax currently supports the following:

  • N-way joins; 1-1, 1-M, M-M cardinality
  • Outer joins
  • Timestamps
  • Relational operators: <, >, <=, <=, =, Like
  • Boolean operators: AND, OR, support for the 'IN' clause
  • DISTINCT results

Is it possible to develop rules using something like a state machine or Petri-net diagram? If so, describe how this is programmed. Please provide examples.
Definitely. As mentioned previously, UDAs enables Turing complete, state-based reasoning.

What techniques are available for defining aggregations of events over a stream or set of streams (e.g. sliding windows based on time or number of events, etc)?
UDAs. Using group-by keys that are based on time can specify sliding windows.

What built-in functions are available for aggregations of events?
We currently support MAX, MIN, COUNT, SUM, AVG.

Can sets of aggregations be defined (similar to group-by in SQL)?
Yes.

Does RTE provide correlation of events from multiple event streams?
Certainly through joins and views.

How does your language allow conditions to be written that involve specific or recurring calendar times?
Timestamps can be included in query predicates.

How can a condition trigger when a complex event condition does not occur within some time window?
We keep all events in historical storage so event detection can span the entire history.

Describe what actions can be executed after an event query is triggered? For example, generate internal events, send alerts, perform operations, or advance a workflow.
Actions taken by the consumer of the event notification are user-defined. Any Java function or application logic can be executed.

Can system execute queries that include both streamed and non-streamed (e.g. relational DB) data sources?
Yes, RTE stores both.

If so, how are non-streamed data sources evaluated, cached and accessed?
Cached in memory in a high speed operational datastore. Non-streamed data can be accessed using ad-hoc SQL queries.

Do you accept external timestamps for specifying event timing? If so, can you tolerate events arriving out of order wrt their external timestamps?
External timestamps can be modeled as columns in the data schema and then query predicates can reference these columns.

What event sources do you support/integrate with? What adapters do you provide for event capture?
Relational data sources can be readily pumped through our JDBC driver interface. Open-source adapters like Object/Relational adapters (e.g., Hibernate) can be used for integrating object-based event sources.

Can multiple physical event sources contribute to a single logical event stream? If so, can event sources be added dynamically?
Disparate event sources can be logically joined into a view; queries can then reference this view.

What capabilities do you have for dealing corrupted events in incoming event streams and diminishing negative impacts of such events?
Events can be modeled as not only append-only streams but as updateable streams, so if corrections must be made, the stream can be updated and the continuous query result set will be transparently updated.
How is event routing handled across nodes within your system? When is event routing explicit and when is it implicit?

Event routing is explicit between server and remote clients. RTE provides built-in messaging between client and server nodes so that client applications may be developed completely in an event driven manner. When a new event that implies changes to a Continuous Query ResultSet is detected on the server, the client is explicitly notified via a callback API known as a CQListener interface. The CQListener interface provides two methods that apprise the client of notification events. The beforeResultSetUpdated() callback informs the client that an event comprised of RowDeltas is being routed to the client. Once the client application returns from the beforeResultSetUpdated() callback, the client side RTE logic will then apply the ResultSet updates and then invoke the second explicit callback API: afterResultSetUpdated(). The afterResultSetUpdated() callback informs the client application that new events have been captured on the server and applied on the client node's JDBC ResultSet. The parameter passed to the afterResultSetUpdated() method is a CQUpdate object. From a CQUpdate, the client application can extract precise information on what rows in the ResultSet have changed, the way a row has changed (new, updated, deleted), and what the new field values are.

Event routing is implicit between server processes, viz., between the primary RTE server and secondaries. RTE utilizes a replicated state machine approach for high-availability. Therefore, any DML and DDL events are implicitly routed from the primary to any number of secondary nodes.

Combinations of event processing nodes can create interesting patterns of event propagation. Does your system provide any mechanism for supporting and analyzing such distributed patterns?

Yes. You can use one (or several) RTE processes to "pre-process" data through aggregation and other methods. Processed results can then be pushed into another RTE process responsible for managing higher-level patterns.

Platforms and standards support

What OS platform(s) does your software support?
Any platform that supports a Java virtual machine runtime environment. Our software is 100% pure Java for maximum deployment heterogeneity.

What language is your software written in?
Java on the server side. Client API's available in Java, C#, and C++.

What network protocols does it support/rely upon?
TCP and IP/Multicast (Multicast is not required).

Adaptavist Theme Builder Powered by Atlassian Confluence