Scalable Object Persistence (SOP)

100 million transaction-protected inserts in 17 hours on a laptop!



Product Discussion
1 Simple Data Persistence
At a glance, SOP is a framework for object persistence. Its .NET implementation is housed in a single DLL that extends the .NET Framework's generic collections to provide object persistence transparently. By switching from the generic Dictionary to the Stores in SOP, which implement IDictionary, your application's objects are persisted transparently.
It's as simple as that: no new application to learn, just a few extra methods and a reference to Sop.dll. This is the very same non-intrusive, simple data persistence technique we introduced and promised back in 2001/2002.
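As a minimal sketch of that switch, the code below is written against the IDictionary interface rather than a concrete collection, so the backing store can be chosen by the caller. It uses only standard .NET types; the class and method names (StoreAgnosticDemo, AddGreetings) are hypothetical, and passing a SOP Store in place of the in-memory Dictionary is an assumption based on the description above, not something shown here.

```csharp
using System;
using System.Collections.Generic;

public static class StoreAgnosticDemo
{
    // Depends only on the IDictionary interface, so the caller decides
    // which implementation backs the data.
    public static void AddGreetings(IDictionary<string, string> store)
    {
        store["Hello!"] = "World";
        store["Goodbye!"] = "Moon";
    }

    public static void Main()
    {
        // Plain in-memory collection from System.Collections.Generic.
        IDictionary<string, string> store = new Dictionary<string, string>();

        // Assumption: an IDictionary-implementing SOP Store could be
        // assigned to 'store' instead. It is not constructed in this
        // sketch because its API is not covered at this point.
        AddGreetings(store);

        Console.WriteLine(store["Hello!"]);  // prints "World"
        Console.WriteLine(store.Count);      // prints 2
    }
}
```

Because AddGreetings never names a concrete collection type, the only line that would change when moving to a persistent store is the one that creates it.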

Following is a discussion of the case for simple data persistence.

Memory is an expensive resource. Application-managed objects are typically limited by the amount of RAM available, so the application has to employ explicit storage routines to store objects or application data in a disk-based solution, for example by using Xml Serialization to files on disk or by persisting data to an RDBMS such as Oracle or Microsoft SQL Server. With the currently available solutions, such as client connectivity libraries for those RDBMS servers (e.g. ADO.NET), application developers must devise specific modules that map object or data schemas to the underlying schemas in the RDBMS data stores, or serialize them to XML files using Xml Serialization techniques. The mapping layer ranges from simple (Xml Serialization) to comprehensive (ADO.NET to an RDBMS server). Xml Serialization is a very simple solution that I would like to extend.
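For reference, the Xml Serialization approach mentioned above can be sketched with the standard .NET XmlSerializer class. The Person type, the file name and the helper names (SaveAll, LoadAll) are hypothetical and exist only for this illustration. Note that the whole list is written and read back in one call, so the entire collection sits in memory at once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical POCO used only for this illustration.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class XmlPersistenceDemo
{
    // Writes the entire list to disk in one pass.
    public static void SaveAll(string path, List<Person> people)
    {
        var serializer = new XmlSerializer(typeof(List<Person>));
        using (var stream = File.Create(path))
        {
            serializer.Serialize(stream, people);
        }
    }

    // Reads the entire list back into memory in one pass.
    public static List<Person> LoadAll(string path)
    {
        var serializer = new XmlSerializer(typeof(List<Person>));
        using (var stream = File.OpenRead(path))
        {
            return (List<Person>)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "people.xml");
        var people = new List<Person>
        {
            new Person { Name = "Ada", Age = 36 },
            new Person { Name = "Alan", Age = 41 }
        };

        SaveAll(path, people);
        List<Person> loaded = LoadAll(path);

        Console.WriteLine(loaded.Count);    // prints 2
        Console.WriteLine(loaded[0].Name);  // prints "Ada"

        File.Delete(path);  // clean up the temporary file
    }
}
```

The mapping effort here is close to zero, which is the appeal; the cost is that every save and load touches the full data set.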

The idea is to offer an additional simple persistence method that complements the currently available solutions and, in the process, to create a new, seamless, stable and scalable avenue for building applications that need to store their application or system information on disk. Xml Serialization, as mentioned, is a very simple persistence method, but with that simplicity comes limited scalability: it requires the entire data stream to be loaded into memory and offloaded in its entirety to disk. The problem arises when objects or data become so large that they no longer fit in memory.

Of course, I do recommend using ADO.NET or other client libraries and persisting to your preferred database servers, such as the SQL servers mentioned above; this way your company can continue to use the rich tool set available for managing your enterprise data. However, that does not mean the current data persistence solutions are complete and that there is no need for additional methods. In fact, developers are encouraged to keep learning new techniques and tools in this area so that, when a problem arises, they have the right tool to get the job done in the most painless, and sometimes effortless, manner.

This is where SOP comes in: an additional data persistence method that is as well suited to simple data persistence as Xml Serialization, but that brings stability and scalability to the solution. If you are writing .NET applications, you need SOP in much the same way that almost any task needs a Standard Operating Procedure; B-Tree Gold v4 SOP plays that role for application developers, in the general application-level sense. It is a very good addition to your tool library. Following is a sample code snippet that demonstrates data persistence using SOP.
It creates a data store with string types for both its key and value, at a specific drive/folder location, adds a string entry and commits the transaction.

// NOTE: this code block has been edited for brevity.
// Open (or create) the object server file and begin a transaction.
ObjectServerWithTransaction server = Transaction.BeginOpenServer(
    "c:\\OServer.dta", new ServerProfile(ProfileSchemeType.MinimumDevice));

// Get a store whose key and value are both strings.
Virtual.ISortedDictionary StringsStore = SortedDictionary.GetSimpleKeyAndValueStore(
    server.SystemFile.ObjectStore, "StringsLookup");

// Add one entry, then commit the transaction.
StringsStore.Add("Hello!", "World");
server.Transaction.Commit();
2 Features & Benefits
2.1 Scalability
Our test results show that SOP can do 1 million record inserts, with 250,000 unique records persisted in each of four tables, in less than 3 minutes on an averagely equipped, three-year-old laptop. On the same machine it takes about the same time to read all those records sequentially in ascending order, reading from the main table and doing searches (queries) for related records in the other three. This is impressive performance from a data engine housed entirely in a ~260 KB DLL. SOP uses advanced caching, virtualization and data I/O algorithms to deliver this level of performance. A database cluster layout can place data stores on separate disks to take advantage of the parallel I/O those drives offer. The tests were not set up for a multiple-disk scenario, so SOP is expected to perform better on a beefier hardware setup where it can use multiple disk heads for parallel I/O. The performance of a data-partitioned configuration has not been measured on such hardware.
2.2 Stability
SOP employs a unique, robust and scalable transaction model. This technology protects your data and ensures the application never enters an unrecoverable state, even if it or its host machine crashes, without sacrificing performance. A SOP transaction left in an uncommitted state is rolled back, returning the SOP database(s) to the last committed state, either explicitly (Rollback function) or on application restart (RollbackAll function). The scalability tests discussed above were run within a transaction. In SOP, using a transaction is recommended whenever you make bulk or single update/insert/delete changes to your data store(s). SOP's transactions are scalable and have been shown to handle hundreds of thousands of changes within a session with no observed performance degradation. Thus all your bulk (or single) change operations are protected and are not in danger of corrupting your database.
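The commit and rollback semantics described above can be illustrated with a toy, in-memory model. This is not SOP's implementation: ToyTransactionalStore is a hypothetical class that simply copies whole dictionaries, so it is neither durable nor scalable. It only shows what "return to the last committed state" means for the caller.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory model of commit/rollback semantics.
// Hypothetical class for illustration; not part of SOP.
public class ToyTransactionalStore
{
    // State as of the last commit.
    private Dictionary<string, string> committed = new Dictionary<string, string>();
    // State including uncommitted changes.
    private Dictionary<string, string> working = new Dictionary<string, string>();

    public void Add(string key, string value) { working.Add(key, value); }

    public bool ContainsKey(string key) { return working.ContainsKey(key); }

    public int Count { get { return working.Count; } }

    // Make all changes since the last commit permanent.
    public void Commit() { committed = new Dictionary<string, string>(working); }

    // Discard all changes since the last commit.
    public void Rollback() { working = new Dictionary<string, string>(committed); }
}

public static class TransactionDemo
{
    public static void Main()
    {
        var store = new ToyTransactionalStore();

        store.Add("a", "1");
        store.Commit();      // "a" is now part of the committed state

        store.Add("b", "2");
        store.Rollback();    // "b" was never committed, so it is discarded

        Console.WriteLine(store.Count);             // prints 1
        Console.WriteLine(store.ContainsKey("b"));  // prints False
    }
}
```

A crash-safe engine has to achieve the same observable behavior on disk, which is the part this sketch deliberately leaves out.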
2.3 Simplicity
SOP exposes its data stores through the .NET generic IDictionary interface and extends it with the necessary persistence methods, such as Save and transaction commit/rollback. The familiarity and simplicity of generics and the IDictionary interface keep the learning curve for SOP very low, because existing developer skills carry over fully when using SOP for persistence. Save your Plain Old CLR Objects (POCOs) as they are to a SOP data store, then search, access and reconstitute them directly into the objects you have defined. No object-to-relational mapping is needed; this is comparable to the simplicity of Xml Serialization for persistence, but done in a very scalable and stable manner. In fact, Xml Serialization is one of the serialization methods SOP supports. However, instead of requiring the entire object stream to be loaded into or offloaded from memory, SOP virtualizes it and keeps only a handful of objects in memory as a cache.
2.4 Lightweight
SOP is housed in a single DLL; that is it. It supports application-dictated encoding, so your text data can be saved in your preferred encoding scheme; by default it uses UTF-8.

It has transaction support for stability and is very scalable, which means it can scale up to enterprise level easily. It requires only about 35 MB of RAM on a 32-bit machine to manage about a million records using SOP's minimum device profile.

2.5 Flexible Licensing Model
SOP comes with a range of high-value licensing models catering to students, individual developers and enterprise customers. Please contact a 4ATech sales representative for licensing details.

3 Sample Usage Scenarios
Following are some potential real-world uses of the SOP data engine.
3.1 Simple Data Persistence Engine
As described in the first section above, SOP provides seamless data persistence for your applications. Use your existing knowledge of generics, collections and Xml Serialization to start using SOP for your application data persistence.

Keep in mind that SOP offers an additional method for persisting your data; it is not a replacement for the SQL servers available in the market. We designed SOP to fill a niche in the data engine/data platform space that includes embedded database applications, middle-tier databases, etc. This has been the cornerstone of B-Tree Gold since its inception and market availability back in 1998.

3.2 OS/Kernel level Registry
Because of its minimal resource requirements, SOP can be a very good data platform for your operating system or your kernel-level, high-volume registry. In fact, because of its embedded virtualization, your OS/kernel can manage large volumes of data in a very stable manner when using SOP data stores. Imagine being able to roll back corrupted data when the machine crashes or is rebooted in the middle of a transaction, for example when a power failure occurs or the user turns off the machine while a device driver is being installed. Without writing a single line of recovery code, just open the ObjectServer using SOP's Transaction's BeginOpenServer method and you bring your OS/registry state back to a previous stable state, before the half-done or aborted installation.

Virtualize your objects simply by using SOP data stores instead of .NET collections.

3.3 Lightweight Embedded Database (LED)
Embed your database within your application. There is no need for separate data servers; SOP is hosted within your application.

3.4 Data Engine
As a lightweight, stable, scalable and flexible data engine, SOP can be extended and used as a platform for writing higher-level database engines. Use it, extend it and focus on your higher-level functionality without being bothered by data-engine-level plumbing.

3.4.1 RDBMS Data Engine
Use SOP to author the next relational database management system. Based on our estimates and test results, SOP can be a very good platform for authoring higher-level data engines. Build on top of SOP: support authoring and management of two-dimensional data entities (e.g. tables, rows, columns), add a SQL query engine (it can be via LINQ!), add support for ACID transactions, row locks and a database tool set, and voilà, you have a new data server that supports object persistence at the low level and your RDBMS data contraptions/projections at the higher level.
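To make the LINQ remark concrete, here is a small sketch of a SQL-like query over any IDictionary using standard LINQ to Objects. The Customer type, the sample data and the NamesInCity helper are hypothetical, and the store is a plain in-memory Dictionary; whether a SOP store can be queried this way efficiently is an assumption not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical "row" type used only for this illustration.
public class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class LinqQueryDemo
{
    // Roughly: SELECT Name FROM customers WHERE City = @city ORDER BY Name
    public static List<string> NamesInCity(
        IDictionary<int, Customer> customers, string city)
    {
        var names = from entry in customers
                    where entry.Value.City == city
                    orderby entry.Value.Name
                    select entry.Value.Name;
        return names.ToList();
    }

    public static void Main()
    {
        // Keyed by customer id; any IDictionary implementation would do.
        IDictionary<int, Customer> customers = new Dictionary<int, Customer>
        {
            { 1, new Customer { Name = "Ben", City = "Manila" } },
            { 2, new Customer { Name = "Cara", City = "Cebu" } },
            { 3, new Customer { Name = "Ana", City = "Manila" } }
        };

        List<string> result = NamesInCity(customers, "Manila");
        Console.WriteLine(string.Join(", ", result));  // prints "Ana, Ben"
    }
}
```

This query scans every entry; an engine built for the purpose would be expected to use its indexes instead of a full scan.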

3.4.2 Data Mining Engine
Use SOP as your data mining application's data engine. Since SOP is native managed code, you can use C#/.NET to author your data mining algorithms. The simplicity and expressiveness of the language translate to very granular control over your algorithms, without using database cursors and inheriting their slowness. Use C# all the way and work in bulk mode every time.

3.4.3 Indexing Engine
Use SOP as the data engine for your document indexing needs. All data indexing, with no added costs imposed by RDBMS-ness; just pure indexing of your data!

The list of potential uses of SOP as a data engine solution is vast and would not fit in this paper if we attempted to list them all.