We thank Jordan Lenchitz for his first guest blog post, and hope there are many more to follow. If you would like to post on the YottaDB blog please contact us at info@yottadb.com.
When it comes to experimental video art, the database you choose shapes what is possible, what is fast, and what is sustainable.
My Creative Process
Step one: global generation. I build MUMPS routines that encode my compositional ideas into global subscripts. These routines embody the algorithmic decisions that define a unique work of experimental video art. Each run of a MUMPS routine generates thousands of nodes representing frequencies, timings, spatial coordinates, color values, and whatever my experimental video art requires.
Step two: replication filters. My generated globals flow through replication filters to apply the “business logic” for the experimental video art on SI replicas.
Step three: relationalization. SQL queries run on the replicas to map the global subscript structure onto relational tables. This is where the raw algorithmic output becomes queryable: I can ask “give me all frequencies between 400 and 500 hertz” or “find the points where two voices intersect” without regenerating any globals.
Step four: connection to Pure Data. I connect to the SQL database from Pure Data, the realtime visual programming environment, using its PostgreSQL connector. Pure Data reads the relationally mapped data and renders it, synthesizing the audio, driving the visuals, creating the final artwork. The database becomes the bridge between algorithmic conception and sensory experience.
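Step three above can be sketched in miniature. The toy Go program below (hypothetical data and names; real globals carry far richer structure) flattens hierarchical nodes into relational-style rows and runs the "frequencies between 400 and 500 hertz" query:

```go
package main

import "fmt"

// row is a toy stand-in for one relationally mapped global node, e.g. a
// node like ^art(work,"freq",id)=hz flattened onto a table row.
type row struct {
	Work string
	ID   int
	Hz   float64
}

// freqsBetween mimics the relational query "give me all frequencies
// between lo and hi hertz" over the flattened rows.
func freqsBetween(rows []row, lo, hi float64) []row {
	var out []row
	for _, r := range rows {
		if r.Hz >= lo && r.Hz <= hi {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	rows := []row{
		{"pareidolia", 1, 330}, {"pareidolia", 2, 440},
		{"pareidolia", 3, 495}, {"pareidolia", 4, 660},
	}
	for _, r := range freqsBetween(rows, 400, 500) {
		fmt.Println(r.ID, r.Hz) // prints rows 2 (440) and 3 (495)
	}
}
```

In the real pipeline the flattening is done by Octo's relational mapping on the SI replicas and the filtering by SQL, so no globals need regenerating.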
Why YottaDB
I could have used a relational database for steps three and four and stopped there (and many people do!) but then I would have lost steps one and two. I would have written the algorithms in some other language, serialized them into SQL, and given up the expressiveness of MUMPS for algorithmic composition. YottaDB gives me the whole stack. MUMPS is extraordinarily good at manipulating hierarchical data, which is what deeply nested algorithmic composition actually looks like. Global subscripts are how these systems want to be written, not a workaround for something else. The replication layer means I don’t have to choose between computational power and queryability. I can generate at full speed in one environment and expose relational views to Pure Data without exporting or rebuilding anything. The separation between the algorithmic engine and the consumer interface happens at the database layer, not in application code. Performance matters too; generating thousands of data points, running replication filters across all of them, then querying subsets in real time as Pure Data renders is not a trivial workload and YottaDB has no trouble with it.
The Why Behind the Why
Databases are opinionated because they encode assumptions about what data looks like, how you access it, what’s fast, and what’s slow. Most of them assume relational structure from the start but YottaDB assumes you might want hierarchical algorithmic structure first and relational views of that same data later. For my creative work, this is the whole ballgame. Algorithmic composition lives in the hierarchical world where relations come later (as a view!). A database that forces you to think relationally first takes away the tools that make algorithmic thinking natural. YottaDB lets MUMPS be MUMPS and SQL be SQL in the same system, without translation layers or duplication. That has shaped every work of experimental video art I’ve made since 2023. If you’re building systems where algorithmic or hierarchical data generation matters and you need relational queryability later, YottaDB is worth a serious look. It trusts you to think in the structures that actually fit your problem and it will let you make strange and beautiful things.
About Jordan Lenchitz
Jordan Lenchitz is a MUMPS programmer and experimental video artist based in Verona, Wisconsin. He holds degrees in mathematics, French, and music composition from Indiana University and is completing a PhD in music theory and composition at Florida State University. (His dissertation examines listening to a cappella vocal musics as an extension of Eleanor Gibson’s ecological approach to perception.) His creative work centers on extended just intonation and algorithmic composition as well as contributions to open source projects. You can listen to his video work and contact him through his website.
Credits
- Screenshots from pareidolia and in memoriam ben johnston used with permission of Jordan Lenchitz.
Koalas are some of the cutest animals around, but are listed as vulnerable by the International Union for Conservation of Nature and Natural Resources and listed as endangered by the Australian Department of Climate Change, Energy, the Environment and Water.
I wanted to explore a JSON dataset with the ZYDECODE command introduced in YottaDB r2.04. I found a JSON dataset of koala records for the twenty years to 2014 in Noosa Shire in Queensland, downloaded it, and renamed the file to NoosaShireKoalaRecordsto2014.json. I then loaded the raw data into the local variable line, from which I parsed the JSON into the ^koala global variable. Note that ZYDECODE expects the top level of line to be the count of lines that make up the JSON file, which is one less than i, since the last value of i corresponds to when YottaDB encountered the end-of-file.
YDB>set file="/Distrib/Datasets/NoosaShireKoalaRecordsto2014.json" open file use file

YDB>for i=1:1 read line(i) quit:$zeof

YDB>use $principal set line=i-1 zydecode ^koala=line

YDB>
With some knowledge of the schema of the dataset, I used AIM to cross-reference the number of koalas seen at each sighting, and to determine the maximum number of these asocial animals observed at a single sighting (which turns out to be 4).
YDB>kill sub set sub(1)="""features""",sub(3)="""properties""",sub(4)="""NUMBER_SEE"""

YDB>set koalanumx=$$XREFDATA^%YDBAIM("^koala",.sub,,,,,,1)

YDB>write $order(@koalanumx@(0,""),-1)
4
YDB>set x="" for  set x=$order(@koalanumx@(0,4,x)) quit:""=x  write x,!
480
744

YDB>
So what are these locations at which four koalas were sighted at a time?
YDB>for x=480,744 write ^koala("features",x,"properties","TOWN"),!
Doonan
Noosa Heads

YDB>
So, if I want to see koalas in Noosa Shire, Doonan and Noosa Heads are the places I should visit!
More importantly, I hope you can see how useful YottaDB r2.04’s ZYDECODE is when used with AIM to explore and analyze JSON datasets. YottaDB r2.04 also has a complementary ZYENCODE command to convert global and local [sub]trees to JSON, using an encoding that allows [sub]trees with values at roots to be converted to JSON and back without data loss using ZYDECODE. We look forward to hearing from you as you try this functionality.
Credits
- Photograph of Koala used under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- Photograph of Noosa Heads used under the Creative Commons Attribution-Share Alike 4.0 International license.
Although it has been over a year since we released r2.02, we have not been idle. Unlike Santa’s elves, who must be ready in time for Christmas no matter what, r2.02 was such a robust release that we had the luxury of taking our time to get things into r2.04 that we wanted to. We couldn’t get everything in – in the software world, there is always something that has to be deferred – but we believe r2.04 was worth the wait.
We originally intended r2.04 to focus on performance, and it does. We blogged about critical section performance in Critical Section Performance in r2.04. But performance took on a life of its own, and we did so much more. Every release adds functionality, and the major functionality added in r2.04 is the ability to convert between M and JSON. And, as with performance, there is so much more in the release than that. You can read the release notes and see the development details. With everything in it, r2.04 is our biggest release yet, reminiscent of the Antonov An-225 Mriya above, which at 253 metric tons had the largest carrying capacity of any aircraft ever built.
Performance
In addition to the critical section performance enhancement, performance related enhancements in r2.04 include:
- Faster object code using the naked reference template – the compiler automatically detects places where it can use the faster object code template for naked references, including code where the M language does not support naked references.
- Garbage collection with no sorting – sorting consumes a major part of the time spent in garbage collection; eliminating it allows applications to run faster, while using more memory. In r2.04, an application can choose between speed and memory.
- Faster runtime code for both inside and outside critical sections – in addition to critical section performance, r2.04 includes many other optimizations.
- A JOB command that is as much as five times faster – applications that JOB processes to handle incoming TCP connections should benefit from this.
Below are two graphs to demonstrate the performance improvements in r2.04. The benchmarks were run on a Red Hat Enterprise Linux 10 system using the default garbage collection, i.e., with sorting. In our testing, we did not observe any material difference in performance between ext4 and xfs filesystems.
The benchmarks compare performance on the following YottaDB releases and GT.M versions:
- YottaDB r2.02, our previous release.
- GT.M V7.1-002, which has been merged into r2.04.
- GT.M V7.1-010, the latest GT.M version as of the r2.04 release. Although V7.1-010 is not merged into r2.04, you can compare the performance of r2.04 against V7.1-010.
- YottaDB r2.04.
In both cases, the y axis is runtime in seconds, i.e., less is better.
The first benchmark simulates interest posting on accounts by a real-time core-banking application.

The second benchmark uses cooperating concurrent processes to calculate the lengths of 3n+1 sequences for integers from 2 through one million.
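For the curious, the per-integer computation behind this benchmark is simple. A minimal single-process sketch in Go (the benchmark itself distributes the work among cooperating YottaDB processes sharing results through the database):

```go
package main

import "fmt"

// collatzLen returns the number of terms in the 3n+1 sequence starting
// at n, counting both n and the final 1: e.g. 6 → 6,3,10,5,16,8,4,2,1,
// which is 9 terms.
func collatzLen(n int) int {
	length := 1
	for n != 1 {
		if n%2 == 0 {
			n /= 2
		} else {
			n = 3*n + 1
		}
		length++
	}
	return length
}

func main() {
	// Find the longest sequence for starting values 2 through 100.
	best, bestLen := 0, 0
	for i := 2; i <= 100; i++ {
		if l := collatzLen(i); l > bestLen {
			best, bestLen = i, l
		}
	}
	fmt.Println(best, bestLen) // 97 119
}
```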

Observations
- Both benchmarks show that GT.M V7.1-002 and V7.1-010 perform comparably.
- Both benchmarks show that with large numbers of processes, many times the number of CPUs, both GT.M versions scale better than YottaDB r2.02, i.e., GT.M V7.1-002 has improved scalability compared to V7.0-005, the GT.M version merged into YottaDB r2.02.
- The interest posting benchmark clearly shows YottaDB r2.04 outperforming r2.02 and both GT.M versions.
- The 3n+1 benchmark shows r2.04 outperforming both GT.M versions at all loads, and at the high end scaling up better than GT.M versions and r2.02 (i.e., having a lower slope).
Even as they highlight the performance of r2.04, the graphs illustrate an important fact about benchmarking: performance characteristics can vary widely between workloads.
Functionality
The most significant functional enhancement in r2.04 is the ability to export (encode / serialize / stringify) [sub]trees as JSON and to import (decode / deserialize / parse) JSON strings into [sub]trees. Since [sub]trees can have values at their roots as well as their own [sub]trees, which JSON does not support, the encoding format allows for such M [sub]trees (i.e., with a $DATA of 11) to be encoded and for the JSON to be re-decoded flawlessly, without any data loss. There are both C and M APIs (see example below); JSON parsing is done using the Jansson library. There are numerous additional enhancements to functionality, including:
- ZSHOW “V” able to display variables at a specific stack level – the ability to examine variable values obscured by called stack frames makes for easier debugging.
- For processes started with a JOB command, $ZYJOBPARENT has pid of parent process – there is no longer a need to explicitly pass this as a parameter when using the JOB command, or for a child process to look in the /proc filesystem.
- SET $ZROUTINES supports globbing of shared library filenames – simplifies application code that needs to include a number of shared libraries, e.g., plugins, in $ZROUTINES.
- Warning if a variable appears more than once in a NEW – catches inadvertent application programming errors.
- GT.M versions V7.1-000, V7.1-001, and V7.1-002 are merged into r2.04.
Fixes
Every YottaDB release must pass all the tests of predecessor YottaDB releases and more. We find and fix bugs and misfeatures, for example:
- WRITE /TLS does not set $TEST if no TIMEOUT was specified.
- Appropriate permissions for $ydb_tmp and, where appropriate, parent directory.
Since the upstream GT.M team only releases the source code for each version, but not the automated tests, we create our own automated tests when merging a GT.M version into YottaDB. During the process, we find and fix GT.M bugs and misfeatures, for example:
- MUPIP RUNDOWN runs down database files even when replication instance files do not exist.
- In Kubernetes pods, Source Server connects reliably with Receiver Server.
Note that the titles of issues describing fixes to bugs and misfeatures reflect the correct behavior after we fix the issues, and not issues as originally reported (since a fix may only be indirectly connected with original symptoms reported).
JSON Example
Here is an example of decoding a 29MB JSON string into a global variable, and encoding the global variable back into JSON.
YDB>set jsonfile=$zsearch("work/github/test-data/large-file.json") open jsonfile:readonly

YDB>use jsonfile for i=1:1 read line quit:$zeof set json(i)=line

YDB>use $principal close jsonfile set json=$order(json(""),-1) write json
11356
YDB>set begin=$zut zydecode ^m=json set end=$zut write $fnumber(end-begin/1000,",")," msec"
553.165 msec
YDB>set begin=$zut zyencode jsonout=^m set end=$zut write $fnumber(end-begin/1000,",")," msec"
847.974 msec

YDB>
The above was executed on a laptop with an Intel i7-1260P CPU.
In Conclusion
Sorry to keep you waiting, but all good things take time. YottaDB r2.04 has landed, and we look forward to hearing from you as you use it.
Credits
- Photograph of Antonov An-225 Mriya coming in to land used under the Creative Commons Attribution-Share Alike 3.0 Unported license.
- Photograph of MSC Loreto used under the Creative Commons Attribution-Share Alike 4.0 International license.
It’s a wrap! The end of a long journey, and YDBGo v2 is finally here with its pristine finish and a V8 rumble that’s champing at the bit to get some traction in your code.
YDBGo is the Go language wrapper for YottaDB, arguably the world’s fastest key-value database, with strong ACID semantics at scale, and a mature code base which has been in production since the mid 1980s.
YDBGo v2 is more concise, easier to read, and more “Go-like”. Version 1 was long-winded, hard to read, had complex function signatures, and the syntax was not “Go-like”. While v1 had two command sets called “SimpleAPI” and “EasyAPI”, v2 is simpler and easier than both (see the syntax comparison below). Version 2 is also faster than both, and better protects you from inadvertent bugs. These features are presented below along with other functional additions, risk reductions, internal improvements, and a final section on migration from v1 to v2 which includes a table of examples of syntax changes.
Speed Improvements
YDBGo v2 is faster than v1:
- For basic operations like setting a node (benchmarked here):
- 1-4% faster than the v1 SimpleAPI
- 4 to 7 times as fast as the v1 EasyAPI
- For a fully-loaded multi-process application (per this 3n+1 sequence benchmark):
- 1-3% faster than the v1 SimpleAPI
- v1 EasyAPI not reported as it was too slow to be worth benchmarking
How We Did It
Most of the speed gains were achieved in v2 by reducing memory allocations — database accesses use database objects (actually, the Go equivalent of objects: type instances). Specifically, v2 requires database access through a connection instance *Conn and node instance *Node.
Each Conn instance allocates buffers for API transfer of error strings, values, and, where necessary, M call-in parameters. A Node instance holds a database variable name and subscript names in API-call format so that rapid operations on that database subtree may be performed many times.
Although Child() nodes may be created from any Node instance, for fast iteration Node instances may be Index()’d to access sub-nodes without allocating a new Node instance for each subnode. Index() achieves this internally by yielding a “mutable” node instance that is re-used by changing (mutating) the stored subscripts. Iterators that traverse the database are also provided; they automatically use this mechanism by yielding mutable nodes as the loop variable.
Reduced memory allocations and mutating node objects both result in less garbage collection than the EasyAPI.
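The mutable-node technique is an instance of a general Go pattern: reuse one instance across a loop instead of allocating per item. A generic sketch (illustrative only, with invented names; not YDBGo's actual implementation):

```go
package main

import "fmt"

// node is a stand-in for a database node: a variable name plus
// subscripts held together in API-call format.
type node struct {
	subs []string
}

// index overwrites the last subscript in place, the way a "mutable" node
// lets a loop visit sibling subnodes without allocating per iteration.
func (n *node) index(sub string) *node {
	n.subs[len(n.subs)-1] = sub
	return n
}

func main() {
	// One allocation up front; the loop then reuses it for every subnode,
	// so the garbage collector has nothing extra to clean up.
	scratch := &node{subs: []string{"^koala", ""}}
	for _, town := range []string{"Doonan", "Noosa Heads"} {
		fmt.Println(scratch.index(town).subs)
	}
}
```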
Calling M, the language embedded in YottaDB, is also faster with more intuitive syntax by allowing Go’s binary types (int, float, etc) to be passed as native parameter types to the API. This avoids needless conversion to and from strings with relatively slow conversion functions like strconv.Atoi(), and avoids associated error checking.
There are other minor speed gains. Database access via Node instances reduces the number of parameters stacked by each function call and also reduces the error checking needed by each function. Error checks are now performed once when the Node instance is created, rather than at each API call on that node, yielding a slight speed increase. The same mechanism allows fewer function calls to be made than with the SimpleAPI, achieving better readability and a small speed gain.
More specific details of speed improvements are documented in this Issue.
Functional Improvements
Since Go does not support C’s atexit() functionality, the API now provides an explicit handler function, ShutdownOnPanic(), which application code should defer at the start of every goroutine. This ensures that the database will be cleanly run down before the process exits, ensuring database structural integrity without an application needing to write its own panic handler. As for the main goroutine, the API Init() function now returns a handle that must be passed to defer Shutdown(): an intuitive cue to the programmer that the main goroutine also needs to be shut down.
Version 2 correctly handles panics that occur in Go callback functions that traverse the CGo boundary: specifically, panics inside transactions or inside signal goroutines. Previously such panics would unwind directly through the CGo call, which is unsafe in Go. Now panics are caught inside the callback with recover, the error is returned through the CGo boundary, and the panic is re-raised once back in Go. The re-raised error message also includes the traceback of the original panic so that errors inside transactions and signals may still be easily located. This topic could be fleshed out with examples in a future blog post. Let us know if you’re interested in such an article.
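The catch-and-re-raise pattern can be sketched generically (a simplified model; the hypothetical runCallback stands in for the CGo boundary crossing):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// runCallback stands in for invoking a Go callback from across the CGo
// boundary: a panic must not unwind through the foreign frame, so it is
// caught here and converted to an error carrying the original traceback.
func runCallback(cb func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic in callback: %v\n%s", r, debug.Stack())
		}
	}()
	return cb()
}

func main() {
	if err := runCallback(func() error { panic("boom") }); err != nil {
		// Back on the Go side of the boundary, the wrapper would now
		// re-raise the panic, traceback included: panic(err).
		fmt.Println("caught:", err)
	}
}
```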
Documentation and Source Improvements
For v2, the official location of the API documentation has moved to the Go Packages website, where Go developers look for package documentation. The YottaDB Multi-Language Programmers Guide points to the Go Packages site for the Go v2 API. Documentation for each function now includes a leading purpose statement and numerous executable examples have been added.
When you click a function name from Go Packages website to take a peek at the YDBGo source code, you’ll find that v2 is more readable and maintainable. Internally, it uses more idiomatic Go, such as the range keyword, iterators, and atomic variables instead of pseudo-C versions of the same. Extensive use of struct type instances cleans up internal as well as API function signatures. The codebase is now fully compliant with the Go lint tool staticcheck. Yoda conditions have been replaced with more familiar Go syntax.
All the original unit tests have been implemented against v2, and automated test coverage for code has improved from 75% to 96%.
Syntax Improvements
The use of type instances to access the database simplifies the v2 API:
- API method names have been renamed according to Go policy, to be whole words, and the scope omitted now that it is clear from the parent type.
- tptoken and errstr need no longer be passed to every API function, as they are stored in the Conn instance. Each Node instance also references its Conn instance, so the latter doesn’t need to be passed to Node API functions.
- The Conn and Node type instances harness Go’s garbage collector to automatically free their allocated buffers when each instance is no longer used. This means an application no longer has to call Alloc and Free functions, and it prevents the inadvertent memory leaks that were possible with the v1 SimpleAPI.
- Calling M from Go is now simpler and more Go-like using the new Import() function.
See also: Examples of syntax changes.
Error Handling
Error handling has been standardized and simplified for readability:
- All API errors (returned or panicked) may now be identified by an error Code (negative codes are for YottaDB errors and positive codes are for YDBGo errors). This allows the programmer to identify and handle specific errors as desired, even for panicked errors captured by a Go recover statement.
- Node methods now automatically handle expected YottaDB return codes that are not errors. For example:
  - Iterators terminate on NODEEND.
  - Get() automatically allocates more space and retries on INVSTRLEN.
  - Get() allows the user to supply a default value rather than returning GVUNDEF or LVUNDEF.
- Node methods now panic on all programmer errors (per Go policy), and also on database system errors like out of memory.
- Panicking on system errors greatly improves the readability of most database operations, which no longer have to check for errors. It also means that robust applications should use Go recover to capture database system errors in order to handle them gracefully rather than panic.
Signals
As with v1, signals that are used by YottaDB require special treatment if an application also needs to use them. The v2 API has changed the signal handling as follows:
- RegisterSignalNotify() has become SignalNotify() and now matches the function signature of Go’s built-in signal capture function signal.Notify().
- In addition, a v2 API signal handler no longer has to specify whether to pass the signal on to YottaDB before, during, or after the user’s handler. Instead, user handlers simply call NotifyYDB() at the point in their handler where they wish to notify YottaDB of the signal.
Migration from v1 to v2
Applications that use YDBGo v1 continue to operate without change, and YottaDB continues to support the v1 API. To aid migration of large applications from YDBGo v1 to v2, it is possible to use v1 and v2 APIs in the same application code. For details, see the migration section of the README.
- Here is an example Go program migrated from YDBGo v1 EasyAPI to YDBGo v2 as a diff.
- Similarly, here is a diff of another Go program migrated from v1 SimpleAPI to v2.
Below is a table mapping v1 syntax to v2. For further information, refer to the official documentation.
Examples of Syntax Changes
[The table of v2-versus-v1 syntax examples did not survive conversion to this format. It compared v2 syntax with v1 Simple API and Easy API syntax for basic operations, node predicates (HasTree, HasValueOnly, etc.), locking, iteration (v1 had no iterators, making iteration complicated using SubNextE() and very complicated using NodeNextE()), transactions, calling M, handling signals, and debug output (which has no v1 equivalent). Refer to the official documentation for the v2 syntax.]
There you have it, folks. Let us know how you like it.
Credits
- Images generated by Google Gemini in response to prompting by K.S. Bhaskar.
- Code highlighted using Online Code Syntax Highlighter.
We have heard repeatedly from database users that for truly ACID transactions, no database matches the scalability of YottaDB, or its upstream GT.M. Our recent blog post ACID Transactions Are Hard At Scale … Part 1 discusses why Consistency and Isolation at scale are hard, and ACID Transactions Are Hard At Scale … Part 2 discusses how YottaDB meets that need. This blog post discusses why ACID transaction Durability is hard at scale, and how YottaDB achieves it.
What is Durability?
We summarize Durability thus:
Durability means that once the transaction is completed, it is permanently recorded in the database and cannot be erased even if the computer crashes.
Since YottaDB, or any other database, is software, it relies on the correct operation of the underlying hardware. Without a guarantee of correct operation of the underlying hardware,1 correct operation of software cannot be guaranteed. Since the perfect hardware that never fails has not yet been invented, a more practical requirement is that hardware failure does not damage stored information, i.e., it is OK for computers to crash, but when a computer crashes, data that has been committed to nonvolatile storage survives intact. Database Durability starts with the premise that while hardware failure can make data unavailable, the data is intact and recoverable. (In a subsequent blog post, we will discuss application availability and Durability when this premise is violated.)
Why is Durability Hard to Implement at Scale?
As committing a transaction simply changes blocks in the database file in the filesystem, committing a transaction is theoretically just a matter of writing those blocks. But there are gaps between theory and practice, especially for performance at scale.
- Since the database file is a random access file, updating the file involves writing blocks at multiple offsets in the file. While this may be a lesser performance concern with solid-state storage which, unlike spinning disks, does not have to move heads, random writes on solid state storage are still slower than sequential writes.
- Since a computer can crash in the middle of writing multiple blocks, when a computer is rebooted after a crash, you need to know whether or not all the blocks of a transaction were written. If all of them were written and can be read back, the transaction is committed, but if only some of them were written, then the transaction is not committed and the blocks that were written need to have their prior contents restored.2
- Even when all the blocks are written to the filesystem by the database, owing to buffering by Linux, the data may not immediately be committed to non-volatile storage.
How Does YottaDB Implement Durability at Scale?
All databases ultimately implement Durability through the file system, and there are only a limited number of ways to do this, e.g., by calling fsync(). While we cannot tell you how the myriad other databases use these filesystem APIs, we can tell you how YottaDB implements Durability at scale for ACID transactions.
YottaDB has multiple ways to access database files. This blog post describes the most common Buffered Global (BG) access method with before-image journaling. This technique writes update information to a journal file.
Each database file has a cache of database blocks in a shared memory segment. This cache sits in front of the Linux file buffer cache because it is much faster for processes to read from and write to shared memory than the filesystem. Each database file also has an active journal file that is written to sequentially, i.e., by appending to it.
An epoch is a periodic checkpoint such that if the system crashes immediately after an epoch, the database file has structural integrity and is up to date, i.e., it does not need repair. At an epoch, all dirty (modified) global buffers are written to the filesystem and dirty buffers in the filesystem are written to nonvolatile storage with an fsync(). By default, epochs occur every 300 seconds, but can occur more frequently if YottaDB encounters an earlier opportunity.
There are two types of data records in a journal file: before-image records and update records. The first time that a block in a database file is modified after an epoch, a before-image record of the unmodified block is written to the journal file. Setting or deleting nodes generates update records, with each update record preceded in the journal file by before-image records of the block(s) it is modifying. If multiple updates within an epoch modify the same block, only the first update of the series causes a before-image record of the block to be written to the journal file.
Committing a transaction conceptually consists of two steps:
- Writing the journal records to the journal file, and ensuring that they are committed to nonvolatile storage with an fsync(), ensuring Durability.
- Updating the global buffers for the transaction’s updates. YottaDB processes cooperate to manage the database, and the blocks in global buffers eventually get written to the filesystem, at or before the next epoch.
While Consistency and Isolation are relevant when an application is running, Durability is relevant when recovering from system crashes.
When recovering from a system crash, YottaDB reads the end of the journal file. If it indicates that the system crashed immediately following an epoch, recovery is only a matter of cleaning up some metadata. If the end of the journal file indicates a crash that happened between epochs YottaDB recovers the database, i.e., provides Durability, as follows:
- It finds the end of the last committed transaction from the records in the journal file. Any transaction update records after that are an incomplete transaction, i.e., the system crashed before the transaction was committed, and so the transaction did not happen – Atomicity requires that all updates of a transaction be committed.
- Reading the journal file backward from the end, it reads before-image records, and applies those to the database file, a process referred to as recovery or rollback. When it reaches an epoch, the database file has been restored to the state it had at the epoch, i.e., it is structurally sound and up to date as of the epoch.
- Reading the journal file forward from the epoch, it processes and applies the update records of each transaction to the database. This brings the database up to date with the last committed transaction, thus delivering Durability.
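The three recovery steps can be modeled on a toy in-memory "database" of blocks. This is a conceptual sketch with invented record types, not YottaDB's actual journal format:

```go
package main

import "fmt"

// record is a toy journal record; the kinds mirror the concepts above
// (epoch checkpoints, before-images, updates, transaction commits).
type record struct {
	kind  string // "epoch", "before", "update", or "commit"
	block int
	val   string
}

// recoverDB replays a journal after a crash: discard updates past the
// last commit, roll back before-images to the last epoch, roll forward.
func recoverDB(db map[int]string, journal []record) {
	// 1. Find the last committed transaction; updates after it belong to
	// an incomplete transaction and must not be replayed.
	last := -1
	for i, r := range journal {
		if r.kind == "commit" {
			last = i
		}
	}
	// 2. Backward from the end, restoring before-images until the epoch;
	// this also undoes torn writes of the incomplete transaction.
	epoch := 0
	for i := len(journal) - 1; i >= 0; i-- {
		if journal[i].kind == "epoch" {
			epoch = i
			break
		}
		if journal[i].kind == "before" {
			db[journal[i].block] = journal[i].val
		}
	}
	// 3. Forward from the epoch through the last commit, applying updates.
	for _, r := range journal[epoch : last+1] {
		if r.kind == "update" {
			db[r.block] = r.val
		}
	}
}

func main() {
	db := map[int]string{1: "torn1", 2: "torn2"} // crash left torn writes
	journal := []record{
		{kind: "epoch"},
		{kind: "before", block: 1, val: "old"},
		{kind: "update", block: 1, val: "new"},
		{kind: "commit"},
		{kind: "before", block: 2, val: "x"},
		{kind: "update", block: 2, val: "y"}, // never committed
	}
	recoverDB(db, journal)
	fmt.Println(db[1], db[2]) // new x
}
```

Block 1's committed update survives the crash, while block 2's torn write from the uncommitted transaction is rolled back to its before-image: Durability for committed transactions, Atomicity for incomplete ones.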
Of course, while conceptually simple, there is virtually unlimited complexity in implementing this Durability mechanism so that it scales. Here are just a few examples:
- While committing a transaction conceptually involves just the two steps listed above, the actual commit code involves multiple phases internally; a straightforward implementation might deliver perhaps half the throughput YottaDB actually delivers.
- As bypassing Linux filesystem buffering for journal files on high-end IO subsystems such as NVMe drives and SANs connected by fiber channels can yield higher write performance, the SYNC_IO option instructs YottaDB to bypass the buffering.
- Since even real-time core-banking systems have batch processes (see 100,000 Foot View of Core-Banking Systems), if a batch process can restart after a crash based on database state, such a batch process may require YottaDB to provide just Atomicity, Consistency, and Isolation. YottaDB provides such applications with an option to declare a transaction as a “BATCH” transaction that can accept relaxed Durability. Of course, if a subsequent transaction requires Durability, database serialization mandates that YottaDB make all prior transactions Durable.
- A production database usually consists of multiple database files, whose epochs typically happen at different times.
- While database updates that are not within ACID transactions do not demand the same guarantee of Durability, they do need to be made Durable, and this Durability typically happens in a fraction of a second.
Over the years, the YottaDB code base has provided transaction Durability to mission critical applications around the world, in banking & finance, healthcare, and more. We have tried to explain here how YottaDB implements that Durability, and why you should trust it to do so.
Looking Ahead – When Hardware Failure Damages Or Destroys Filesystems
Hardware failures can damage or destroy filesystems. Failures can also make datacenters unavailable (even datacenters of Amazon, Google, and Microsoft, no matter what any marketing hype says). Our blog post to follow, on Replication, will discuss how YottaDB responds to this need.
In Conclusion
If you have questions, or would like to learn more about how YottaDB transaction processing can meet your transaction processing application’s need to scale while providing five nines availability, please contact us.
Footnotes
- Memory and storage have error rates (e.g., see DRAM Errors in the Wild: A Large-Scale Field Study) that are unacceptable for financial transaction processing. Techniques like mirroring and error-correcting RAM reduce the error rates to acceptable levels. Ideally, the error correction technique would be single-error correction, double-error detection, so that when uncorrected errors occur, they are detected rather than resulting in incorrect software operation.
- Some databases use copy-on-write (CoW) to deal with this requirement, but copy-on-write has scalability limitations, e.g., from Copy-On-Write – When to Use It, When to Avoid It, “CoW is an expensive process if done aggressively. If on every single write, we create a copy then in a system that is write-heavy, things could go out of hand very soon.”
Credits
- Photo of Stone tablet with cuneiform inscription by Dr. Osama Shukir Muhammed Amin used under Creative Commons Attribution-Share Alike 4.0 International. It documents a land purchase by a man named Tupsikka. From Dilbat, Iraq. 2400-2200 BCE.
- Photo of Cuneiform clay tablet by user Zinkir used under Creative Commons Attribution-Share Alike 4.0 International. It complains to the merchant Ea-Nasir about delivery of the wrong grade of copper.
… But YottaDB Does the Hard Work for You
Part 1 discussed what ACID transactions are and why they are hard to scale. This post discusses how YottaDB does the hard work for you.
Concurrency Control
At a very high level, there are two types of concurrency control: pessimistic (known as locking) and optimistic.
In pessimistic concurrency control, as the transaction logic executes, it “locks” the data it accesses. To ensure fully ACID properties, this lock must prevent concurrent processes executing their transaction logic from both reading and writing the data. Reading is blocked so that the other processes do not see the intermediate states of data. The consequent performance loss has led databases to provide transaction variants that are not fully ACID, such as multiversion concurrency control (MVCC; also known as “stable reads”) where data a process has read within a transaction will not change for that process as a result of a commit by another transaction. While this suffices for some types of business logic (in the balance transfer example, if there is a concurrent change to a service charge, it is probably acceptable for the transaction to use the prior service charge), it is unacceptable for others (such as two concurrent transactions withdrawing funds from the same account).
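To see why stable reads are unacceptable for concurrent withdrawals, here is a deliberately simplified simulation (illustrative Python, not a real MVCC engine): each transaction validates against its own private snapshot, so two withdrawals can both pass validation and overdraw the account.

```python
# Illustrative simulation of the "stable reads" anomaly: each transaction
# validates against a snapshot taken when it started, so both withdrawals
# see the full original balance.

def run_stable_read_withdrawals(balance, amounts):
    """Each transaction reads from its own snapshot taken at start."""
    snapshots = [balance for _ in amounts]  # every snapshot shows the same funds
    for snap, amount in zip(snapshots, amounts):
        if snap >= amount:                  # validation against the stale snapshot
            balance -= amount               # ...but the debit hits the live balance
    return balance

# $100 in savings; two concurrent $80 withdrawals both pass validation.
final = run_stable_read_withdrawals(100, [80, 80])
print(final)  # -60: the account is overdrawn, violating the business rule
```

A fully ACID implementation would force the second withdrawal to see the first withdrawal's debit and reject it for insufficient funds.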
In optimistic concurrency control (OCC), a process executing a transaction keeps track of the data that it reads, but does not update the database until the time comes to commit the transaction. If no data that it has read has changed (data that it intends to update must first be read), it commits the transaction; if any data has changed (“collided”), it starts over from the beginning and re-attempts the transaction. Optimistic concurrency control, when implemented in hardware, is called transactional memory. Software transactional memory also exists.
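The OCC commit protocol just described can be sketched with an illustrative in-memory versioned store (the `Store` class and all names here are hypothetical, not YottaDB's API): track the version of everything read, commit only if nothing has changed, and otherwise retry from the top.

```python
# Minimal sketch of optimistic concurrency control over an in-memory
# versioned store. Illustrative only: real databases persist data and
# coordinate across processes.

class Store:
    def __init__(self):
        self.data, self.version = {}, {}    # key -> value, key -> version

    def read(self, key):
        return self.data.get(key, 0), self.version.get(key, 0)

    def commit(self, read_set, write_set):
        # Commit succeeds only if nothing in the read set has changed.
        for key, seen in read_set.items():
            if self.version.get(key, 0) != seen:
                return False                # collision: caller must retry
        for key, value in write_set.items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True

def occ_update(store, key, fn, max_tries=100):
    """Read, compute, and commit only if the key is unchanged since the read."""
    for _ in range(max_tries):
        value, version = store.read(key)
        if store.commit({key: version}, {key: fn(value)}):
            return True                     # no collision: committed
    return False                            # gave up: persistent livelock

store = Store()
value, version = store.read("counter")      # this transaction sees (0, 0)
store.commit({}, {"counter": 5})            # another process commits first
ok = store.commit({"counter": version}, {"counter": value + 1})
print(ok)                                   # False: collision detected
occ_update(store, "counter", lambda v: v + 1)
print(store.read("counter")[0])             # 6: retried from a fresh read
```

Note that the failed commit leaves the store untouched; the retry simply re-runs the transaction logic against fresh data.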
The primary pitfall of pessimistic concurrency control is deadlock: if process 1 locks data A and process 2 locks data B, then as they execute program logic process 1 finds it needs B and process 2 finds that it needs A. Deadlocks can of course be more complex: for example, A needs a resource that B has locked, B needs a resource that C has locked, and C needs a resource that A has locked. In a complex environment where there might be thousands or tens of thousands of concurrent transactions, there might be a deadlock involving a few transactions. The system is busy, and the application is making progress overall, but those few transactions are mutually unable to progress further. Much research has gone into detecting and preventing deadlock, but the problem remains a hard one to solve at scale. We know of at least one database that provides Atomicity and Durability, but makes application software responsible for Consistency and Isolation.
The primary pitfall of optimistic concurrency control is livelock: if two processes update the same data, one of them will update it first. The other will detect a collision and retry the transaction, thus executing the transaction logic twice. At scale, the system is busy, but wasting resources because processes are executing transaction logic multiple times for each successful commit; at least they are not deadlocked.
While most databases use pessimistic concurrency control and ameliorate conflicts with MVCC, MVCC is unacceptable for real-time core-banking systems. YottaDB uses OCC with mechanisms to detect and limit livelock.
YottaDB Concurrency Control
Every database region has a transaction number that is incremented on every update.1 Every database block that is updated during a transaction has its transaction number set to the transaction number of the region, which is incremented in anticipation of the next transaction.
A process inside a transaction notes the transaction number of every database block it reads (a block must be read before it can be updated). Updates are kept in process-private memory. When the process commits the transaction, it checks whether one or more blocks it has read were updated since it read them, and if no block has changed, it commits the transaction. If even one block has changed, it restarts the transaction.
To prevent persistent livelock, if a transaction is forced to restart thrice, on the fourth attempt YottaDB locks other processes out of updating the database region(s) involved in the transaction, and completes the transaction.
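A minimal sketch of the mechanism just described, with hypothetical names and none of YottaDB's actual internals: blocks carry the region transaction number at which they were last updated, a commit fails if any block read has since changed, and the fourth attempt runs with concurrent updaters excluded.

```python
# Illustrative model of block transaction numbers and restart escalation.
# This is a sketch, not YottaDB's implementation: a real engine journals,
# persists blocks, and coordinates the escalation lock across processes.
import threading

class Region:
    def __init__(self):
        self.tn = 0                         # region transaction number
        self.blocks = {}                    # block id -> (value, tn when updated)
        self.update_lock = threading.Lock() # stands in for the escalation lock

    def read(self, blk):
        return self.blocks.get(blk, (0, 0))

    def try_commit(self, read_tns, writes):
        # Restart if any block read has been updated since it was read.
        if any(self.read(blk)[1] != tn for blk, tn in read_tns.items()):
            return False
        self.tn += 1                        # increment for this transaction
        for blk, value in writes.items():
            self.blocks[blk] = (value, self.tn)
        return True

def run_transaction(region, logic):
    """Retry on collision; the fourth attempt excludes concurrent updaters."""
    for attempt in range(4):
        if attempt == 3:
            with region.update_lock:        # no other updater can collide now
                region.try_commit(*logic(region))
                return attempt + 1
        if region.try_commit(*logic(region)):
            return attempt + 1              # number of attempts used

region = Region()

def add_one(region):
    value, tn = region.read("a")
    return {"a": tn}, {"a": value + 1}      # (read set, write set)

print(run_transaction(region, add_one))     # 1: committed on the first attempt
```

In this sketch the escalation is a simple mutex; the point is only that the fourth attempt cannot be forced to restart again.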
As the probability of multiple concurrent transactions updating the same blocks is low, this works well in practice. Nevertheless, YottaDB has mechanisms for applications to detect and avoid livelock. Each database region has a count of the number of first, second, third, and subsequent2 transaction restarts. In the typical scenario of random collisions, one would see a low rate of first restarts, a lower rate of second restarts, and virtually no third restarts. A pattern other than this means that there are pathological restarts resulting from application design that creates “hot” data areas where concurrent transactions update the same data. YottaDB has tools to enable application developers to identify such hot data areas, so that they can avoid pathological restarts.
To avoid locking problems such as the one described in Vendor’s secret ‘fix’ made critical app unusable during business hours, YottaDB also provides for transaction timeouts: a transaction that exceeds the specified time is aborted with an error. Since the transaction will not have been committed, no other concurrent transaction is affected.
YottaDB’s OCC has additional benefits over locking:
- It is computationally easier to restart a transaction that has made no updates than to roll back a transaction that has.
- It is hard for pessimistic concurrency control to lock data that does not exist. With YottaDB’s OCC, index blocks are part of the read set of a process executing a transaction. If another process adds a record while a process is inside a transaction, an index block will have a new transaction number, resulting in a restart. In other words, YottaDB transactions ensure Consistency and Isolation not just for the existence of data but also for the absence of data.
- Programming ACID transactions in YottaDB is extremely simple. For example:
- There is no need to WATCH for changed data. YottaDB does the watching automatically with no need for application program action.
- YottaDB automatically rolls back and restarts transactions, carrying them through to being committed, with no explicit programming required on the part of the application.
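One way to picture the absence-of-data point above (an illustrative model, not YottaDB's block format): even a lookup that finds nothing records the transaction number of the index block it traversed, so a concurrent insert into that block forces a restart.

```python
# Illustrative model: a store whose keys hash to "index blocks", each
# carrying a transaction number. A lookup that misses still observes the
# tn of the index block that would hold the key.

class IndexedStore:
    def __init__(self, buckets=4):
        self.buckets = buckets
        self.index_tn = [0] * buckets       # one "index block" tn per bucket
        self.data = {}

    def bucket(self, key):
        return hash(key) % self.buckets

    def lookup(self, key):
        # Even a miss returns the tn of the index block traversed.
        return self.data.get(key), self.index_tn[self.bucket(key)]

    def insert(self, key, value, tn):
        self.data[key] = value
        self.index_tn[self.bucket(key)] = tn

store = IndexedStore()
_, seen_tn = store.lookup("acct|42")        # transaction observes: key absent
store.insert("acct|42", 100, tn=7)          # concurrent process adds the key
_, now_tn = store.lookup("acct|42")
print(now_tn != seen_tn)                    # True: the reader must restart
```

Because the index block's transaction number changed, a transaction whose logic depended on the key being absent collides and restarts, preserving Isolation.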
Yes, but does it really scale?
The proof of the pudding, as they say, is in the eating. YottaDB, and its upstream code base, GT.M (whose transaction processing implementation YottaDB inherits and improves), are live, in daily production use, at the world’s largest real-time core-banking systems, with tens of millions of accounts, and peak ACID transaction rates of tens of thousands of transactions per second.
YottaDB does the hard work for you by providing clean, simple ACID transaction APIs in a variety of languages.
Contact us to discuss what YottaDB can do for your transaction processing application.
Footnotes
1 YottaDB treats single updates as “mini transactions,” which simplifies both explaining and implementing how the database engine works.
2 There are circumstances under which a transaction can restart more than three times. But that would be getting lost in the weeds for the level of detail of this blog post.
Credits
- Photo of ATMs in Croatia by Gewild who dedicated this work to the public domain.
- Photo of savings bank box of Second National Bank, Allentown PA is in the public domain.
… But YottaDB Does the Hard Work for You
What are ACID transactions?
ACID is an acronym for Atomic, Consistent, Isolated, and Durable.
Consider a transaction to transfer $100 from your savings account to your checking account. At a high level, it comprises at least the following steps:
- If there is sufficient balance in the savings account:
- Deduct $100 from your savings account balance.
- Add $100 to your checking account balance.
- Record information about the transaction, e.g., the origin of the request, authentication, whether it was authorized or denied (and if so, why), any service charges, etc.
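The steps above can be sketched as a single transaction body (illustrative helper names; in a YottaDB application this logic would run inside a transaction so that all of it commits or none of it does):

```python
# Sketch of the balance-transfer steps. Hypothetical names; note that the
# whole function must execute atomically for the bank to stay in balance.

def transfer(accounts, log, src, dst, amount, request_info):
    if accounts[src] >= amount:             # step 1: sufficient balance?
        accounts[src] -= amount             # step 1a: debit the savings account
        accounts[dst] += amount             # step 1b: credit the checking account
        authorized = True
    else:
        authorized = False
    # Step 2: record the transaction, whether authorized or denied.
    log.append({"request": request_info, "src": src, "dst": dst,
                "amount": amount, "authorized": authorized})
    return authorized

accounts = {"savings": 250, "checking": 0}
log = []
print(transfer(accounts, log, "savings", "checking", 100, "ATM#17"))  # True
print(accounts)                             # {'savings': 150, 'checking': 100}
```

If a crash occurred between the debit and the credit, the database would be out of balance; Atomicity, discussed next, is precisely the guarantee that this cannot happen.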
Atomicity means that either all the steps need to be executed, or none of them need to be executed – you would be unhappy if the amount was deducted from your savings account but not deposited to your checking account, the bank would be unhappy if the amount was added to your checking account but not deducted from your savings account, the regulators would be unhappy if the transaction was not properly authorized and recorded, etc. If none of the steps happen, and you have enough balance in your savings account, you might be unhappy as a customer, but the bank and you are whole (“in balance”), and you can try again.
Durability means that once the transaction is completed, it is permanently recorded in the database and cannot be erased even if the computer crashes. If the bank later chooses to reverse the transaction, that would need to be a separate, second transaction. Durability will be the subject of a future blog post.
There is a duality to Consistency and Isolation. Consistency is the requirement that from a business perspective, the bank must be in compliance with business rules as determined by application logic. Consider two transactions executing at the same time (concurrently, but more about concurrency later). Both of them change the funds in accounts, and while they are executing, their temporary, private view is that the bank is not in balance (between steps 1a and 1b in the example above).
But when a process looks at any data other than what it is manipulating within the transaction, it should see the bank as being in balance (Consistency). Isolation means that each transaction is executed as if it were the only transaction on the system. No application logic outside the transaction and executing concurrently with it should be able to see within the transaction, i.e., Isolation of a transaction provides Consistency for concurrent transactions.
So, for example, if there are two simultaneous requests to transfer funds from the savings account, and there are only enough funds for one, it is OK for one to be processed before the other, or the reverse, but each is processed as if it were the only transaction on the accounts. Of course, the second transaction would be rejected for lack of funds. Consistency and Isolation together imply serialization, that there is an order in which transactions are committed.
Why are ACID transactions hard at scale?
ACID transactions are conceptually simple. If an application processes only one transaction at a time, Consistency and Isolation are trivially implemented, while Atomicity and Durability only require a modicum of logic and programming effort. What makes them hard is scale. A large bank will at peak times process thousands to tens of thousands of transactions per second. Transaction processing at scale is challenging. In my thirty years in the database business, it has always been the case that as computers get faster, they are able to handle bigger workloads – but their ability to handle bigger workloads means that bigger workloads are thrown at them and push them to their limits. As the saying goes, “Every day, do more than what is expected of you, and soon more will be expected of you.”
The need to scale means that databases must execute the logic for multiple transactions in parallel, while ensuring ACID properties and serialization. That, in turn, requires concurrency control: the shepherding of thousands to tens of thousands of concurrently executing transactions so that they commit with ACID properties. Concurrency control is just what makes scaling ACID transactions hard.
Imagine you have a drive-through hamburger restaurant. You want to allow each hamburger to be customized, which means you can’t cook the patties in advance. Do you wait till an order is placed and then put patties on the grill, or when a car pulls into the parking lot, do you put a patty on the grill for each person in the car? If the former, then it takes more time to prepare each order. If the latter, you reduce the time to prepare each order, but you will have leftover cooked patties because not every person will order a hamburger.1 Concurrency control to deliver Consistency and Isolation at scale involves similar trade-offs.
Continued in Part 2 …
Part 2 discusses methods for concurrency control, and how YottaDB’s implementation of concurrency control allows it to deliver ACID transactions at scale.
But you don’t need to wait for Part 2. You can Get Started with YottaDB right now!
Footnotes
1 You could perhaps take each day’s leftover patties to make chilli con carne to serve the next day.
Credits
- Photo of transaction in Hanoi by Radek Kucharski used under CC BY 4.0.
- Image of Mandsur Opium Agency Hundi used under CC BY 4.0.
Core-banking systems (CBS) are the legal systems of record for account balances and transactions. For custodians with fiduciary responsibility for their customers’ money, CBS are the most mission-critical applications for commercial banks.
At a high level, there are two types of core-banking systems: batch and real-time.
Batch systems are process-centric, centered around an end-of-day (EOD) process that updates account balances from one day to the next.1 Since transactions can arrive at any time, incoming transactions are placed in a memo file; credits and debits update account balances during EOD processing. This complicates processing logic, since if a debit transaction (e.g., an ATM withdrawal) is received during the day, the previous EOD balance and entries in the memo file must be reviewed to determine if there are sufficient funds in the account, and if there are, another memo must be posted to be processed at EOD. In batch systems, each transaction can be processed multiple times.
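The intraday debit check described above can be sketched as follows (a hedged illustration; real memo files carry far more than signed amounts):

```python
# Sketch of a batch system's intraday debit validation: a debit must be
# checked against the previous EOD balance plus all pending memo entries,
# and, if approved, posted as another memo for EOD processing.

def available_balance(eod_balance, memos):
    # Pending credits and debits have not yet been applied to the balance.
    return eod_balance + sum(memos)

def try_debit(eod_balance, memos, amount):
    if available_balance(eod_balance, memos) >= amount:
        memos.append(-amount)               # post a memo; applied at next EOD
        return True
    return False                            # insufficient funds

memos = [50, -30]                           # a pending credit and a pending debit
print(try_debit(100, memos, 110))           # True: 100 + 50 - 30 = 120 >= 110
print(try_debit(100, memos, 20))            # False: only 10 remains available
```

This is why each transaction in a batch system is effectively processed more than once: once when it arrives (to validate and post the memo) and again at EOD (to apply it to the balance).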
Real-time systems are customer-centric. A database has the current balance for each account. When a transaction arrives, it is processed forthwith, and the account balance updated. An incoming debit transaction only needs to be validated against the account balance – there is no memo file to review. Each transaction is processed just once and the account balance in the database is always current. Real-time systems have an additional advantage for financial institutions: since the account balance is always current, there is virtually no need to reconcile discrepancies.2
Batch systems originated on mainframes in the 1960s, with each part of the EOD processing involving the mounting of tapes and the movement of data from one tape to the next. Those days are of course long gone, and data now resides in databases, while batch processes move data within and between databases. Owing to their ability to process large numbers of accounts (accessing data sequentially is more IO efficient than random access), batch systems traditionally ran on expensive mainframes, and were used by the large financial institutions that could afford them. Since there is not necessarily a single point of truth for transactions and balances, banks that use batch systems must also retain staff to reconcile differences.
Real-time systems originated on minicomputers in the 1980s. Owing to their then limited ability to process data, real-time systems were more commonly used by smaller financial institutions. Apart from random access, real-time systems impose an additional constraint on databases: since a transaction is processed just once, straight through to a customer account, databases must support robust Atomic, Consistent, Isolated, and Durable (ACID) transactions and applications must use them correctly.3 Apart from using more affordable minicomputers, not needing staff to reconcile discrepancies was of course attractive to smaller institutions.
In the late 1990s and early 2000s, the massive scalability of the GT.M key-value database coupled with faster new-generation computers such as the DEC Alpha AXP OpenVMS, and UNIX servers from IBM, HP, and Sun, allowed the Profile real-time CBS to break out of its original savings-bank niche, and scale up to the needs of large financial institutions. The database’s ability to provide business continuity in the face of unplanned and planned events, and robust single points of truth for financial data on more affordable servers, delivered scalability with reduced capital, operating, and staff cost, facilitating its segue into global financial institutions.
Today, batch systems are legacy applications, and no financial institution would start implementing a new batch CBS. Yet legacy systems remain, because the cost to a bank of replacing a working CBS is large: replacing a CBS involves reorganizing business workflows to align them with those of the CBS. This also brings up a difference between banks in the United States and countries such as Thailand. Large banks in the US have grown by buying other banks. In such cases, it is often easier to retain the CBS of an acquired bank, implementing a front end application that provides customers with the illusion of a single bank, than to replace the acquired CBS. Large banks in countries like Thailand have grown organically, and thus process tens of millions of accounts on a single system vs. the millions of accounts that a large US bank may process on a single system.
YottaDB is a downstream derivative of GT.M. While staying upward compatible with GT.M, YottaDB adds language independence, better performance, and support for energy efficient ARM CPUs. Placid (Thailand) Ltd. took advantage of the benefits offered by YottaDB to build a greenfield real-time CBS (see Go+YottaDB – A Perfect Platform for Fintech).
Contact us to build your next mission-critical fintech application, or to migrate an existing application to YottaDB.
Footnotes
1 Without getting too much into the weeds, note that the balance in an account may not be a single number: there can be an available balance, a collected balance, etc.
2 Real-time systems also have EOD processes. For example, interest is usually computed on a daily basis. The important distinction is that financial transactions go straight through to customer accounts.
3 Vendor’s secret ‘fix’ made critical app unusable during business hours is an example of what can go wrong if an application uses the less than robust transactions implemented by a database. YottaDB’s transactions are more robust than those of that database.
Credits
- Palazzo Salimbeni houses the main offices of Banca Monte dei Paschi di Siena. Established in 1472, it claims to be the world’s oldest surviving bank. Photo originally by Ray in Manilla used under CC by 2.0.
- Picture of US Continental Congress eight dollar bill is in the public domain.
Given the range of hardware, operating systems, and file systems that we support, our internal network where we build and test YottaDB has a wide range of machines. The principal taxonomy is x86_64 vs. AARCH64 (ARM). In the former category, we have both custom-built PCs as well as off-the-shelf PCs and laptops; in the latter, our machines were all single-board computers (SBCs) until recently, as our view was that YottaDB on AARCH64 would primarily be used in embedded systems.
However, with the foothold the ARM architecture has gained in high performance computing – for example, the Fujitsu Fugaku supercomputer is seventh on the November 2025 Top 500 list – it seemed worthwhile to investigate AARCH64 systems other than SBCs. To that end, we purchased some refurbished 2023 Apple Mac Mini M2 Pro systems and installed Ubuntu Asahi 24.04 LTS on them. Since we compile often during software development, the time to build YottaDB is a simple and quick way to compare systems, and was the basis for our initial comparison.
To say that we were blown away is putting it mildly. Our fastest machines now (A and B in the table below) are those Mac Mini M2 Pro systems. Our traditionally fastest x86_64 machine (D in the table) takes more than thrice as long as the faster Apple, and even the system we are setting up to release the forthcoming r2.04 on RHEL 10 (C in the table) takes almost twice as long as the faster Mac Mini M2 Pro. Our fastest SBC is E in the table below.
Here are the results.
| Machine | Description | CPU | # CPUs | RAM | OS | Clang version | Compile time |
|---|---|---|---|---|---|---|---|
| A | Mac Mini M2 Pro | Blizzard M2 Pro | 12 | 32GB | Ubuntu 24.04.3 LTS | 18.1.3 | 16.213 |
| B | Mac Mini M2 Pro | Blizzard M2 Pro | 10 | 16GB | Ubuntu 24.04.3 LTS | 18.1.3 | 19.191 |
| C | Lenovo Thinkcentre Neo | Intel Core i9-14900 | 32 | 64GB | RHEL 10.1 | 20.1.8 | 25.790 |
| D | Custom built | AMD Ryzen 7 5800X | 16 | 32GB | SLED 15-SP7 | 17.0.6 | 55.816 |
| E | Orange Pi 5 | Rockchip RK3588S | 8 | 16GB | Ubuntu 24.04.3 LTS | 18.1.3 | 110.112 |
In the above:
- Compile times are in seconds, as measured by `time (make -j $(getconf _NPROCESSORS_ONLN) && make install) >compile.log`. The reported number is the median of three runs. In practice, there was little variation between the runs.
- The number of CPUs was reported by `getconf _NPROCESSORS_ONLN`.
- All tests were run on an ext4 filesystem on an NVMe drive.
Based on these results, we will no longer consider embedded systems as the primary use case for YottaDB on AARCH64: ARM CPUs are clearly ready for high-end production applications.
Credits
- Photo of Mac Mini M2 from an Apple Inc. web page.
- Video of Ubuntu 24.04 LTS on Apple hardware from Asahi Ubuntu home page.
When Placid (Thailand) Ltd. started working on a greenfield FinTech application that would handle a mission-critical core-banking system at scale, they immediately chose YottaDB. Such a core-banking application requires both high performance as well as uncompromising robustness, and must deliver both at scale with large numbers of concurrent users.
Not all instances in the application use YottaDB, however — instances that handle data that’s used for reporting, for example, may not need to have the performance or the ability to handle concurrency that the core-banking system does, and so they can use other databases.
When planning out the application architecture, the decision to use YottaDB as the data store was made even before deciding on the language to write the application in. The need for robustness and performance was the most important consideration, and combined with the fact that the YottaDB code-base has been production-tested for decades, it was the unquestioned choice. Ultimately, the team decided to write the application in Go, because it is a high-performance language and performance is the priority for this application.
Not every developer on the team interacts directly with YottaDB — and for those who don’t have experience with YottaDB’s native API, there would be a learning curve. Placid handles this by having a small, dedicated database team that creates a YottaDB framework that exposes a fintech-friendly API which allows other developers on the team to access YottaDB — they don’t need to know the YottaDB native API. This adds an element of future-proofing, by allowing the framework to be tweaked internally without requiring changes to the financial applications that use it.
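The facade pattern described here can be sketched as follows; all names are hypothetical, and `NativeKV` merely stands in for the low-level API the database team encapsulates:

```python
# Sketch of a fintech-friendly facade over a native key-value API.
# Hypothetical names throughout; this is the pattern, not Placid's code.

class NativeKV:
    """Stand-in for the low-level key-value API the database team wraps."""
    def __init__(self):
        self._nodes = {}

    def set_node(self, *subscripts, value):
        self._nodes[subscripts] = value

    def get_node(self, *subscripts):
        return self._nodes.get(subscripts)

class AccountStore:
    """The fintech-friendly API that application developers actually use."""
    def __init__(self, db):
        self._db = db                       # the native API stays encapsulated

    def set_balance(self, account_id, balance):
        self._db.set_node("account", account_id, "balance", value=balance)

    def balance(self, account_id):
        return self._db.get_node("account", account_id, "balance")

store = AccountStore(NativeKV())
store.set_balance("TH-001", 125000)
print(store.balance("TH-001"))              # 125000
```

Because application code depends only on `AccountStore`, the database team can change how data maps onto the underlying store without touching the financial applications, which is the future-proofing the paragraph above describes.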
As a result, the Placid team is able to combine the robustness, scalability, and ability to serve large numbers of concurrent users that YottaDB provides with the high performance that Go is known for. On an average day, the banking application handles one million customers.
Comsan Chanma, department manager of the SME team, said he’d recommend other teams follow a similar strategy when implementing YottaDB with the language of their choice — and mentioned that the choice to use YottaDB can and should be completely independent of the programming language. By having a dedicated database team, you’re able to take advantage of YottaDB’s exceptional performance and consistency.
Credits
- Photo of 1,000 baht bill appears to have no copyright.
- Photo of plush Go gopher courtesy the Go Authors and released under the Creative Commons Attribution 3.0 Unported license.