Original article:
https://blog.dgraph.io/post/v0.9/
At Dgraph, we really care about user feedback. Most of what we’ve built since January 2017 has been based on what our community (that’s you!) told us. The biggest contribution we get from our community is feedback. We’ll forgo any code contribution for quality feedback based on real-world usage.
Since the beginning of Dgraph, transactions were on the roadmap as a post-v1.0 feature. Dgraph is a distributed and synchronously replicated system. Adding transactions to such a system is a hard challenge, one that we felt wasn’t worth the complexity to tackle early on.
That changed when Gustavo Niemeyer filed this issue. In this issue, he made a very convincing case for supporting transactions sooner rather than later.
So, coming back to Dgraph, if the idea is indeed to position it as a general alternative for existing databases as I’ve watched in one of your videos, please don’t make the same mistake of postponing transactions for too long, or voicing them as relevant mainly for financial transactions. The sort of consistency at stake is relevant for pretty much any application at all that uses data, even more when even basic details about a record are recorded as multiple individual terms. This would make the situation even worse than with MongoDB. –Gustavo
The arguments made by Gustavo were intriguing enough for us to look seriously in that direction. Once we started looking, evidence was everywhere. People have been complaining about lack of transactions in MongoDB. Bigtable author and Google Fellow, Jeff Dean, considered not implementing transactions in Bigtable his “biggest mistake as an engineer”.
It was clear that transactions were something that we should implement right away.
So that’s what we did. We used what we call the Blitzkrieg approach. Explaining it could be a blog post of its own, but the general idea is that a single engineer or a very small group initially makes changes deep in the core, which breaks most things outside the core (possibly one package). The rest of the team then helps fix up the outer shells level by level. We use this technique regularly to implement major design changes at lightning speed.
The entire work from the reporting of the issue (Sep 13) to implementing transactions in Badger (Oct 5), to releasing v0.9 with transactions (on Nov 14), was done within a time span of two months.
(Translator's note: Badger is a key-value database optimized specifically for SSDs.)
Wow.. that's an amazing turnaround time for that level of complexity. Thank you! https://twitter.com/manishrjain/status/930340309954789382 …
In this blog post, we won’t go into the details of how Dgraph’s transactions work. There’s a lot of interesting bits there, due to the uniqueness of this challenge; so the team decided that a blog post won’t do justice to what has gone into building this amazingly distributed graph database over the past two years. Instead, we plan to write a technical paper about Dgraph’s unique design. Watch for that in the coming months!
So, instead of how it works, this blog post focuses on how to use transactions to build your application.
The transaction model
Transactions come along with a new model for how to interact with Dgraph. Previously, it has just been single queries and data mutations on their own. Now all queries and mutations are performed as part of a transaction.
Dgraph can perform read-modify-write transactions, the typical lifecycle being:
1. Create a transaction. Go client: client.NewTxn().
2. Execute a series of queries and mutations. Go client: txn.Mutate(...) and txn.Query(...).
3. Finally, commit or abort the transaction. Go client: txn.Commit() and txn.Abort().
If two concurrently running transactions write to the same data, then one of the commits will fail. It’s up to the user to retry.
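To make the conflict rule concrete, here is a minimal sketch of "first committer wins" conflict detection in a toy, single-process, in-memory store. It is not Dgraph's implementation: the `Store`, `Txn`, and `ErrConflict` names are hypothetical and exist only for illustration, and the store is not safe for concurrent goroutines.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict is a hypothetical error returned when a commit loses a write conflict.
var ErrConflict = errors.New("transaction aborted: conflicting write")

// Store is a toy key-value store used only to illustrate the rule; it is not Dgraph.
type Store struct {
	clock    uint64            // logical timestamp, bumped on every begin and commit
	data     map[string]string // committed values
	commitTs map[string]uint64 // timestamp of the last commit that wrote each key
}

// Txn buffers its writes until Commit is called.
type Txn struct {
	store   *Store
	startTs uint64
	writes  map[string]string
}

func NewStore() *Store {
	return &Store{data: map[string]string{}, commitTs: map[string]uint64{}}
}

// NewTxn starts a transaction and records its start timestamp.
func (s *Store) NewTxn() *Txn {
	s.clock++
	return &Txn{store: s, startTs: s.clock, writes: map[string]string{}}
}

// Set buffers a write; nothing is visible to other transactions yet.
func (t *Txn) Set(key, value string) { t.writes[key] = value }

// Commit applies the buffered writes unless another transaction committed
// one of the same keys after this transaction started.
func (t *Txn) Commit() error {
	for key := range t.writes {
		if t.store.commitTs[key] > t.startTs {
			return ErrConflict
		}
	}
	t.store.clock++
	for key, value := range t.writes {
		t.store.data[key] = value
		t.store.commitTs[key] = t.store.clock
	}
	return nil
}

func main() {
	s := NewStore()
	a, b := s.NewTxn(), s.NewTxn() // two overlapping transactions
	a.Set("item:42", "sold to alice")
	b.Set("item:42", "sold to bob")
	fmt.Println("a:", a.Commit()) // first committer wins
	fmt.Println("b:", b.Commit()) // second commit is rejected
}
```

The second commit fails because it touches a key that was committed after the transaction began, which is the situation the caller is expected to handle by retrying.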
Why are transactions important?
Database transactions are important for any app that needs to update its state based upon its previous state in a consistent manner or has operations that need to apply in atomic units.
That covers a lot of different things. Just to name a few:
- Marking inventory in an online shop as sold. You wouldn’t want to sell the last remaining item to two customers.
- Paying out bets on an online poker site. It’s important to ensure the same win isn’t paid twice.
- Inventory management for a warehouse. Restocking an item twice without seeing the new quantity could result in twice as much held stock as intended.
- Financial transactions. When transferring money, it’s important that credits and debits on two accounts are either both applied or not applied at all.
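Because a commit can fail when two transactions write the same data, cases like the ones above are usually wrapped in a retry loop. The following is a minimal sketch of such a loop. `ErrAborted` and `RunWithRetry` are hypothetical names, not part of any Dgraph client, and the "transaction" here is simulated with plain variables.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAborted stands in for whatever error a client returns when a commit
// loses a write conflict. The name is an assumption for this sketch.
var ErrAborted = errors.New("transaction aborted")

// RunWithRetry calls fn until it succeeds, fails with an error other than
// ErrAborted, or maxAttempts is exhausted. It returns the number of
// attempts made and the final error.
func RunWithRetry(maxAttempts int, fn func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn()
		if !errors.Is(err, ErrAborted) {
			return attempt, err // success, or an error that retrying won't fix
		}
	}
	return maxAttempts, err
}

func main() {
	stock := 1     // one item left in the shop
	conflicts := 2 // pretend the first two attempts lose a conflict

	attempts, err := RunWithRetry(5, func() error {
		if conflicts > 0 {
			conflicts--
			return ErrAborted
		}
		if stock == 0 {
			return errors.New("out of stock")
		}
		stock-- // the read-modify-write that must happen atomically
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err, "stock:", stock)
}
```

In real code the whole read-modify-write, including the initial query, belongs inside the retried function so that each attempt sees fresh data.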
Dgraph v0.9 introduces distributed ACID transactions with synchronous replication. What this means is that transactions work across multiple servers each holding a part of the graph, providing ACID guarantees.
Increasing throughput is still just a matter of bringing up additional Dgraph instances. There is no need to worry about seeing a previous database state when querying a replica. From the point of view of a single client, once a transaction is committed, its changes are guaranteed to be visible in all future transactions. These guarantees help simplify application code significantly while providing a high level of scalability and crash resilience.
Client Libraries
Dgraph exposes its API via gRPC and HTTP. However…
Transactions require some bookkeeping and state management on the client side. Because of this, it’s strongly recommended to use a client library to interact with dgraph.
At the time of writing, official Go and Java clients and a community-driven JavaScript client are available.
Client libraries for other languages can be implemented on top of the gRPC or HTTP APIs. The best way to approach this is to read the documentation about how to use the raw HTTP API and look at the implementations for other existing clients.
The examples in this blog post will use the Go client.
A simple login system
Prior to v0.9.0, Dgraph had an upsert feature, which has now been removed. An upsert atomically looks up an entity and retrieves it if it exists, or creates it and then retrieves it if it does not.
With transactions, an explicit upsert feature is no longer required. This is because upsert style operations can be performed atomically within a transaction.
So how is this done?
In this example, we model a simple login system, where a user has to provide an email address and password in order to gain access to the system.
If the user already exists, then the password must match. If the user doesn’t yet exist, then their password should be stored for later logins.
It’s important to do all of this in a transaction. If it’s not, then the same account might inadvertently be created twice.
Error checking and JSON marshalling/unmarshalling have been omitted for brevity:
    // Create a new transaction. The deferred call to Discard
    // ensures that server-side resources are cleaned up.
    txn := client.NewTxn()
    defer txn.Discard(ctx)

    // Create and execute a query that looks up an email and checks
    // whether the password matches.
    q := fmt.Sprintf(`
        {
            login_attempt(func: eq(email, %q)) {
                checkpwd(pass, %q)
            }
        }
    `, email, pass)
    resp, err := txn.Query(ctx, q)

    // Unmarshal the response into a struct. It will be empty if the email couldn't
    // be found. Otherwise it will contain a bool to indicate if the password matched.
    var login struct {
        Account []struct {
            Pass []struct {
                CheckPwd bool `json:"checkpwd"`
            } `json:"pass"`
        } `json:"login_attempt"`
    }
    err = json.Unmarshal(resp.GetJson(), &login)

    // Now perform the upsert logic.
    if len(login.Account) == 0 {
        fmt.Println("Account doesn't exist! Creating new account.")
        mu := &protos.Mutation{
            SetJson: []byte(fmt.Sprintf(`{ "email": %q, "pass": %q }`, email, pass)),
        }
        _, err = txn.Mutate(ctx, mu)
        // Commit the mutation, making it visible outside of the transaction.
        err = txn.Commit(ctx)
    } else if login.Account[0].Pass[0].CheckPwd {
        fmt.Println("Login successful!")
    } else {
        fmt.Println("Wrong email or password.")
    }
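The decoding and branching step above can be tried without a running server. The sketch below feeds hand-written JSON into the same struct shape and reports which branch would be taken. The payloads are assumptions inferred from the struct tags in the example, not captured Dgraph output, and `loginResponse` and `classify` are hypothetical names. Unlike the abbreviated example, it also guards against an empty `pass` list before indexing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// loginResponse mirrors the anonymous struct used in the login example.
type loginResponse struct {
	Account []struct {
		Pass []struct {
			CheckPwd bool `json:"checkpwd"`
		} `json:"pass"`
	} `json:"login_attempt"`
}

// classify decodes a query response and returns "create" when no account
// matched the email, "ok" when the password matched, and "denied" otherwise.
func classify(raw []byte) (string, error) {
	var login loginResponse
	if err := json.Unmarshal(raw, &login); err != nil {
		return "", err
	}
	switch {
	case len(login.Account) == 0:
		return "create", nil
	case len(login.Account[0].Pass) > 0 && login.Account[0].Pass[0].CheckPwd:
		return "ok", nil
	default:
		return "denied", nil
	}
}

func main() {
	// Hypothetical payloads, shaped to match the struct tags above.
	samples := []string{
		`{"login_attempt": []}`,
		`{"login_attempt": [{"pass": [{"checkpwd": true}]}]}`,
		`{"login_attempt": [{"pass": [{"checkpwd": false}]}]}`,
	}
	for _, raw := range samples {
		outcome, err := classify([]byte(raw))
		fmt.Println(outcome, err)
	}
}
```

Only the "create" branch needs to write and commit; the other two are read-only, and the deferred Discard in the original example takes care of cleaning them up.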
Bank Account Transfers
The classical example for database transactions is to transfer money between two bank accounts. In this example, we have a set of bank accounts, each represented by a node in the graph. Each node is known by a uid and has its balance represented by a bal predicate.
This example was extracted from a tool we used when testing the correctness of our transaction implementation. The full source is here.
Given the uid of two accounts, we want to transfer money from one account to the other, i.e. reduce one balance and increase the other by the same amount.
It’s important that this is done in a transaction; if it isn’t, then two transfers happening concurrently could result in the net balance of all accounts changing. It could also result in double spending.
    txn := s.dg.NewTxn()
    defer txn.Discard(ctx)

    // Get current balances for the two accounts.
    q := fmt.Sprintf(`{both(func: uid(%s, %s)) { uid, bal }}`, from, to)
    resp, err := txn.Query(ctx, q)

    // Account (defined elsewhere in the full source) holds the uid and bal fields.
    type Accounts struct {
        Both []Account `json:"both"`
    }
    var a Accounts
    err = json.Unmarshal(resp.Json, &a)

    // Perform the transfer.
    a.Both[0].Bal += 5
    a.Both[1].Bal -= 5
    if a.Both[0].Bal < 0 || a.Both[1].Bal < 0 {
        // Abandon the transaction if there are insufficient funds.
        // The deferred Discard cleans up.
        return
    }

    // Write back to dgraph.
    var mu protos.Mutation
    data, err := json.Marshal(a.Both)
    mu.SetJson = data
    _, err = txn.Mutate(ctx, &mu)
    err = txn.Commit(ctx)
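The arithmetic at the heart of the transfer, and the invariant it must preserve, can be sketched on its own. The `Account` shape below is an assumption based on the `uid` and `bal` predicates in the query, and `Transfer` and `ErrInsufficientFunds` are hypothetical names. The sketch runs purely in memory; in a real application the transaction is what makes these two updates atomic.

```go
package main

import (
	"errors"
	"fmt"
)

// Account is an assumed shape based on the uid and bal predicates queried above.
type Account struct {
	Uid string `json:"uid"`
	Bal int    `json:"bal"`
}

// ErrInsufficientFunds is a hypothetical error for this sketch.
var ErrInsufficientFunds = errors.New("insufficient funds")

// Transfer moves amount from src to dst. On failure neither balance changes,
// so the sum of the two balances is the same before and after every call.
func Transfer(src, dst *Account, amount int) error {
	if amount < 0 {
		return errors.New("amount must be non-negative")
	}
	if src.Bal < amount {
		return ErrInsufficientFunds
	}
	src.Bal -= amount
	dst.Bal += amount
	return nil
}

func main() {
	from := &Account{Uid: "0x1", Bal: 7}
	to := &Account{Uid: "0x2", Bal: 3}
	total := from.Bal + to.Bal

	fmt.Println(Transfer(from, to, 5), from.Bal, to.Bal) // succeeds
	fmt.Println(Transfer(from, to, 5), from.Bal, to.Bal) // rejected, balances unchanged
	fmt.Println("total preserved:", from.Bal+to.Bal == total)
}
```

A check like the final line, summing all balances and comparing against the expected total, is a simple way to detect lost updates when many transfers run concurrently.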
Conclusion
It has historically been difficult to implement transactions in NoSQL technologies. Notably, MongoDB has been working on a solution for a while.
So implementing transactions with synchronous replication is a massive milestone for Dgraph. With this complex but valuable feature, our community will be able to build apps on top of dgraph without having to worry about tricky data integrity issues!
We are building a fast, transactional and distributed graph database.
- Get started with Dgraph: https://docs.dgraph.io
- See our live demo: https://dgraph.io
- Star us on GitHub: https://github.com/dgraph-io/dgraph
- Ask us questions: https://discuss.dgraph.io