A Summary of Miscellaneous MyBatis Knowledge Points

By default, MyBatis maps null parameter values to jdbcType=OTHER. Oracle does not support this type, so for Oracle the mapping for null values needs to be changed to jdbcType=NULL.
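
For example, the mapping can be changed globally or per statement parameter (a sketch; the statement and column names are illustrative):

<!-- mybatis-config.xml: change the default JDBC type for nulls -->
<settings>
  <setting name="jdbcTypeForNull" value="NULL"/>
</settings>

<!-- or per parameter, inside a single statement -->
<insert id="addUser">
  INSERT INTO users (user_name, remark)
  VALUES (#{userName}, #{remark,jdbcType=NULL})
</insert>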

“$” performs plain string concatenation for dynamic SQL, while “#” produces a precompiled (PreparedStatement) parameter. Because “$” pastes the raw value into the statement, it is vulnerable to SQL injection; its main legitimate use is for dynamic identifiers such as table names, which cannot be parameterized.
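
A minimal sketch of the difference (identifiers are illustrative):

<select id="findById" resultType="map">
  SELECT * FROM ${tableName}   <!-- "$": spliced into the SQL as-is -->
  WHERE id = #{id}             <!-- "#": becomes a PreparedStatement placeholder -->
</select>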

Within resultMap:

  • “collection” is used for stepwise (one-to-many) queries.
  • “association” is used for stepwise (many-to-one or one-to-one) queries.
  • “discriminator” chooses which mapping to apply based on a column’s value; see the sketch after this list.
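
A minimal sketch of the first two (type and statement names are illustrative; “discriminator” would similarly pick a mapping by inspecting a column):

<resultMap id="deptMap" type="Dept">
  <id property="id" column="id"/>
  <!-- one-to-many, loaded stepwise through another statement -->
  <collection property="emps" select="selectEmpsByDeptId" column="id"/>
</resultMap>

<resultMap id="empMap" type="Emp">
  <id property="id" column="id"/>
  <!-- many-to-one, loaded stepwise -->
  <association property="dept" select="selectDeptById" column="dept_id"/>
</resultMap>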

Within “setting”:

  • “lazyLoadingEnabled” and “aggressiveLazyLoading” together determine when lazily mapped data is actually loaded; a sample configuration follows this list.
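
A sketch of the two settings working together:

<settings>
  <setting name="lazyLoadingEnabled" value="true"/>
  <!-- false: each lazy property is loaded only when it is actually accessed -->
  <setting name="aggressiveLazyLoading" value="false"/>
</settings>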

Expressions inside dynamic SQL tags (for example the test attribute of “<if>”) use OGNL syntax.

The “<where>” tag automatically removes a redundant “and” or “or” at the beginning of the concatenated condition, and “<set>” removes the trailing comma in UPDATE statements. Both are special cases of “<trim>”, which handles these situations with more control.
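
A sketch of “<trim>” reproducing both behaviors (column names are illustrative):

<!-- what <where> does: add WHERE, strip a leading AND/OR -->
<trim prefix="WHERE" prefixOverrides="AND |OR ">
  <if test="name != null">AND name = #{name}</if>
  <if test="age != null">AND age = #{age}</if>
</trim>

<!-- what <set> does: add SET, strip the trailing comma -->
<trim prefix="SET" suffixOverrides=",">
  <if test="name != null">name = #{name},</if>
  <if test="age != null">age = #{age},</if>
</trim>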

When a statement takes a single parameter, the built-in variable “_parameter” refers to that parameter. When there are multiple parameters, “_parameter” is a map containing them.

“<bind>” creates a variable from an OGNL expression, which avoids dynamic concatenation with “$”, for example when building the pattern of a “like” expression.
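
For example (a sketch; names are illustrative):

<select id="findByNameLike" resultType="User">
  <bind name="pattern" value="'%' + name + '%'"/>
  SELECT * FROM users WHERE user_name LIKE #{pattern}
</select>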

Use the “<sql>” tag together with the “<include>” tag to extract common SQL fragments. Dynamic tags can also be used inside a “<sql>” fragment.
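
For example (a sketch; fragment and column names are illustrative):

<sql id="userColumns">
  id, user_name
  <if test="withEmail">, email</if>
</sql>

<select id="findAllUsers" resultType="User">
  SELECT <include refid="userColumns"/> FROM users
</select>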

Situations in Which the First-Level Cache Becomes Invalid:

  • Different SqlSession instances are used.
  • The same SqlSession instance is used, but with different query conditions, meaning that the current first-level cache does not contain the data being queried.
  • The same SqlSession instance is used, but between two queries, an insert, update, or delete operation is performed.
  • The same SqlSession instance is used, but the first-level cache is manually cleared.

Using the Second-Level Cache:

  • Enable the second-level cache with the “cacheEnabled” setting inside the “settings” tag.
  • Declare the cache in the mapper.xml file with the “cache” tag.
  • Make the POJO you want to cache implement the Serializable interface. A configuration sketch follows this list.
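
A sketch of the first two steps (the cache attributes are optional and illustrative); the third step is plain Java on the POJO (implements Serializable):

<!-- 1. mybatis-config.xml -->
<settings>
  <setting name="cacheEnabled" value="true"/>
</settings>

<!-- 2. UserMapper.xml -->
<cache eviction="LRU" flushInterval="60000" size="512" readOnly="false"/>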

Settings Related to Cache:

  • “cacheEnabled=true/false”: Enables or disables the second-level cache.
  • “useCache=true/false”: Can be used on every “select” tag to enable or disable caching on an individual basis.
  • “flushCache” defaults to “true” for “update”, “delete”, and “insert” statements, and to “false” for “select” statements.
  • “sqlSession.clearCache()” only clears the first-level cache.
  • “localCacheScope” can be set to “STATEMENT” or “SESSION” (the default). Setting it to “STATEMENT” effectively disables the first-level cache, since cached entries then live only for the duration of a single statement.

When MyBatis creates the dynamic proxies for plugins, the proxy objects are wrapped layer by layer in the order the plugins are configured. When the target method executes, the interceptors therefore run in the reverse order.

I’m coming back to the world of databases.

I haven’t studied databases for a while now because I had a feeling that they were slowly becoming less important, and the focus of my work in recent years has been more on front-end and back-end development. It wasn’t until recently that I realized how much my previous knowledge of databases had helped me with my new projects. So, I have started my database learning journey again. Why did I realize that knowledge of databases is so important? It all started with my current project.

Since around 2021, I have been working on a project, or more accurately, three projects that are not small, and all of them are related to the Oracle database. From front end to back end, all of them are based on the PL/SQL programming language. The front-end technology is an old relic called Oracle Forms, which young people nowadays may not have heard of. It is somewhat similar to PowerBuilder or Visual FoxPro in the past, allowing you to quickly develop desktop applications on top of the database. Oracle now has a newer replacement technology called Oracle APEX, which offers single-stack rapid development, or low-code development, of web applications on top of the Oracle database. The famous Salesforce was reportedly developed using Oracle APEX.

You should know that even the front-end development in Oracle Forms is done in PL/SQL. Even now, I think Oracle's idea was great for that era. Imagine using PL/SQL, a database programming language, to manipulate interface elements, letting them interact with data in the database and complete all kinds of complex data-processing tasks. Although this is completely different from today's approaches based on the Web and REST APIs, it was undoubtedly a pioneering innovation in its time.

My responsibility is to lead a team that rewrites three Oracle Forms applications as web- and Java-based applications, all related to the insurance industry and dealing with cash flow and pension management. Due to the project's age, the long development time, an unstable development team, and other factors, the code is not very clear and the documentation is incomplete; many documents may still reflect a ten-year-old state. My team and I therefore have to understand the behavior and intent of the old code before we can begin redeveloping the three applications, and that requires a deep understanding of database development.

The nature of the financial and insurance industries means that the data model is usually complex, the data volume is large, and the consistency requirements are high. For example, in the pension insurance business, a policy may run for several decades, with many income and expenditure events every month. If you have hundreds of thousands of policies and the system has been running for 20 years or more, the data volume becomes very large. When policyholders reach retirement age, the calculation of how much pension they should receive depends on their payments over the past 20 years, so these income and expenditure records must be retained for a long time. They matter not only for your company's revenue but also for the policyholders, and even for the tax authorities, because private pension insurance is part of personal income tax calculations.

The complexity of the data model in the finance and insurance industry can result in complex queries, which can cause significant delays in response time when dealing with large volumes of data. The user experience of the front-end is heavily dependent on the database’s responsiveness to queries. This requires you to have a good understanding of the structure of tables and the database’s characteristics, such as query planning. If you can ensure that your query speed is consistently fast, you can save a lot of unnecessary work in designing both the front-end and back-end. This is why I previously mentioned that I gradually realized that my previous database knowledge and experience had been of great help to me in new projects.

Here is a scenario: a customer pays a deposit to our company but does not specify, when transferring the money, which contract the payment is for. Our customer service then needs to contact the customer to confirm, and record the payment against the correct contract. Some might think this is a small or infrequent issue, but it is actually very common. I know a colleague whose main job is to handle exactly this situation every day, sometimes manually processing up to 200 such policies. The screen that handles this situation therefore has to run very smoothly. I remember that when the system was first launched, this screen would inexplicably slow down after running smoothly for one or two weeks: its loading time would suddenly increase from 2-3 seconds to 8-9 seconds. Although the problem disappeared on its own when the system was migrated to a new Linux version, I vividly remember my colleague complaining every week or two about the overtime the slow screen caused her. At that time, I always had to quickly reboot the system for her.

My future learning plan will not only involve relational databases such as Oracle, PostgreSQL, and MySQL, but will also involve distributed query engines such as Apache Spark. It is possible that even after 2024, I will have data integration projects where I will need to deal with a variety of data sources. Therefore, I have included Spark in my learning plan.

Quote

(Repost) The Difference Between Methods and Functions in Scala


Scala has both functions and methods, and most of the time we can ignore the difference between them. Sometimes, however, we have to understand how they differ.

A method in Scala is just like a Java method: it is part of a class. A method has a name, a type signature, sometimes annotations, and an implementation in code (bytecode).

A function in Scala is a complete object. Scala abstracts the concept of a function with the traits Function0 through Function22. A FunctionN trait represents a function with N parameters and a return value of type R (covariant); Function10, for instance, is the trait for functions with 10 parameters.

A Scala function is really an object of a class that extends one of these traits. For example, we can define a function with a function literal:
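
// a minimal sketch of the elided snippet (identifiers assumed)
val f1 = (a: Int, b: Int) => a + b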

The definition above is equivalent to the following:
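
// a sketch of the equivalent definition
val f1 = new Function2[Int, Int, Int] {
  def apply(a: Int, b: Int): Int = a + b
}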

Since Function2 is a trait, it cannot be instantiated with new directly; the new Function2[Int,Int,Int](){} above actually defines an anonymous class implementing the Function2 trait and instantiates it.

apply is syntactic sugar in Scala: when you call obj() on an object obj, the compiler rewrites it as obj.apply(); when you call clazz() on a class clazz, the compiler rewrites it as clazz_companion_obj.apply(), where clazz_companion_obj is clazz's companion object.

The concrete differences can be summarized as follows:

1. A method cannot exist as a standalone expression (except for methods with an empty parameter list), while a function can. For example:
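
// a minimal sketch (the names m and f are referenced below)
def m(x: Int) = 2 * x        // a method
val f = (x: Int) => 2 * x    // a function

f   // fine: f is itself an object of type Int => Int
m   // error: missing argument list for method m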

In this example we first define a method m and then a function f. Using the function name (the function value) as the final expression is perfectly fine, because f itself is an object (an instance of a FunctionN trait). Using the method name as the final expression, however, is an error.

2. A function must have a parameter list, while a method may omit it. For example:
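
// a minimal sketch (the name m1 is referenced below)
def m1 = 100         // a method may omit its parameter list entirely
val f = () => 100    // a function needs at least an empty parameter list
// val g = => 100    // illegal: does not compile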

In this example the method m1 takes zero parameters, so its parameter list can be omitted. A function cannot omit its parameter list.

3. A method name denotes a method invocation, while a function name merely refers to the function object itself.

This one is easy to understand. The variable holding a function literal (also called the function name or the function value) is itself an object of a class implementing a FunctionN trait; to call that object's apply method you need the obj() syntax. So a function is invoked only when parentheses follow its name, as shown below:
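
// a minimal sketch
def m2 = 100          // zero-parameter method
val f2 = () => 100    // a function value

m2     // 100: the method name alone already invokes the method
f2     // the Function0[Int] object itself
f2()   // 100: sugar for f2.apply()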

4. Where a function is expected, a passed method is automatically eta-expanded (converted into a function).

Assigning a method directly to a variable is an error, but if we declare the variable's type to be a function type, it compiles:
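
// a minimal sketch (Scala 2 behavior)
def m3(x: Int): Int = 2 * x

// val v1 = m3             // error: missing argument list for method m3
val v2: Int => Int = m3    // compiles: m3 is eta-expanded into a function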

Of course, we can also force a method into a function explicitly, using Scala's partially applied function syntax:
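
// a minimal sketch
def m3(x: Int): Int = 2 * x
val v3 = m3 _    // the trailing underscore forces eta-expansion; v3: Int => Int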

5. A by-name parameter is essentially a method.

A by-name parameter is really a method with an empty parameter list, as follows:
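
// a minimal sketch of the elided example
import scala.util.Random

def m1(x: => Int) = List(x, x)   // x is a by-name parameter

val r = new Random()
m1(r.nextInt())   // two evaluations of x, so two (usually different) values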

The code above defines a method m1 whose parameter x is a by-name parameter (a method). Because the name of a zero-parameter method is itself an invocation, List(x, x) actually performs two method calls, and therefore yields two different values.

If we slightly modify the definition of m1 and cache x first, the result is quite different:
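
// a minimal sketch
import scala.util.Random

def m1(x: => Int) = {
  val y = x      // evaluate the by-name parameter once and cache it
  List(y, y)
}

val r = new Random()
m1(r.nextInt())  // both elements are now the same value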

Learning Apache Cassandra

  • A cluster presents itself as a single whole.
  • Master-follower replication: the master accepts all writes, so writes have a single point of failure; reads are fine.
  • keyspace: Cassandra's version of a database.
  • The replication factor is typically 3.
  • Document and relational databases are read-optimized, while Cassandra is write-optimized; strictly speaking, optimized not for in-place writes but for inserts and appends. This does not mean Cassandra reads poorly.
  • Although Cassandra is key-value in form, it differs from other key-value stores in that its records have structure. An application therefore gets masterless distribution and a structured-record data model at the same time.
  • Cassandra's secondary indexes cannot be used for sorting; sort order can only be set via clustering columns at table-definition time. Reading in an arbitrary column order is not supported.
  • Immediate consistency is not supported (it comes with poor resilient availability); only eventual consistency is (good resilient availability). When defining the read and write data paths, the level can apparently be chosen, hence the name tunable consistency.
  • When adding a new value to a collection column, Cassandra does not read the current state first, which is why it is fast.
  • Cassandra supports neither joins nor built-in MapReduce.
  • Cassandra tables require a primary key; there is no auto-increment key scheme.
  • Columns have no notion of null, and no default values or constraints.
  • text and varchar are equivalent; neither has a length limit.
  • varint stores integers of unbounded size.
  • decimal is unbounded in size as well.
  • The only time type is timestamp; if you do not want minutes and seconds, use a numeric type instead.
  • uuid stores version 1 and version 4 UUIDs.
  • timeuuid stores version 1 UUIDs and converts conveniently to and from timestamps.
  • A version 1 UUID is built from a timestamp and a MAC address; version 4 uses random or pseudo-random numbers.
  • blob data is written as 0x….
  • SQL (CQL) statements cannot be broken across lines.
  • cqlsh displays at most 10,000 rows.
  • For a table with a single-column primary key, SELECT returns rows in a fixed order, but not an order that makes sense from the client's point of view.
  • Primary keys are ordered not by their literal values but by their tokenized (hashed) values, so comparisons in a WHERE clause must use the token() function.
  • Columns other than the primary key cannot be used in a filter unless they have an index.
  • The primary key is in fact the partition key; a composite primary key can specify more than one partition-key column, plus zero or more clustering columns, e.g. PRIMARY KEY ((status_update_username, status_update_id), id).
  • A compound key effectively models a parent-child relationship.
  • A shadow table for a table can be built from the original table's primary key plus a timeuuid: partition by the original primary key, sort by the timeuuid.
  • A shadow table can also use static columns to hold the columns whose changes need not be recorded; that way a single table can record the change entries. A static column is shared within a partition.
  • A timeuuid can be written directly as a constant literal.
  • unixtimestampof(timeuuid) extracts the Unix timestamp.
  • A timeuuid column value can be generated with now().
  • Queries must address the partition key. Partition keys are hashed, so they cannot be compared for ordering; to compare, apply token() first. A partition can hold at most about 2 billion cells; of course, a partition may also contain just one row.
  • mintimeuuid('2014-01-01') and maxtimeuuid('2015-01-01') produce boundary timeuuids for range queries.
  • Clustering columns can be compared (range queries work on them).
  • Only count(1) and count(*) are allowed, no other forms; in Cassandra, count is as expensive as reading the rows themselves.
  • dateof(timeuuid) extracts the timestamp.
  • Currently, ORDER BY works only on the clustering columns of the primary key.
  • To sort a clustering column in reverse order at table creation: WITH CLUSTERING ORDER BY (id DESC).
  • Paging in Cassandra takes two steps:
    • Find the first row of the table.
    • Read the rows in that row's partition; after the last row of a partition, move on to the next partition.
    • Repeat the steps above.
  • To implement autocomplete, create a table (prefix, remaining, fulltext) where prefix is the partition key and rows are clustered by remaining; see the sketch after this list.
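
A minimal sketch in CQL (table and column names follow the note above):

CREATE TABLE autocomplete (
  prefix    text,   -- partition key: the characters typed so far
  remaining text,   -- clustering column: kept sorted within the partition
  fulltext  text,
  PRIMARY KEY (prefix, remaining)
);

-- completions for a prefix come back ordered by "remaining"
SELECT fulltext FROM autocomplete WHERE prefix = 'cas';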

Notes on the Mongoid Tutorial

Model class names cannot end with “s”, because Mongoid will treat the name as the plural form of a word. For example, “Status” would be considered the plural form of “Statu”, which causes a few known problems. You can work around this with a custom inflection:

ActiveSupport::Inflector.inflections do |inflect|
  inflect.singular("status", "status")
end

The collection for the model’s documents can be changed at the class level if you would like them persisted elsewhere.

class Person
  include Mongoid::Document
  store_in collection: "citizens", database: "other", client: "secondary"
end

The store_in macro can also take lambdas – a common case for this is multi-tenant applications.

class Band
  include Mongoid::Document
  store_in database: ->{ Thread.current[:database] }
end

In cases where you want to set multiple field values at once, there are a few different ways of handling this as well.

person.attributes = { first_name: "Jean-Baptiste", middle_name: "Emmanuel" }
person.write_attributes(
  first_name: "Jean-Baptiste",
  middle_name: "Emmanuel"
)

When using a field of type Hash, be careful to adhere to MongoDB's legal key names (no dots, and keys must not start with “$”), or the values will not store properly.

class Person
  include Mongoid::Document
  field :url, type: Hash

  # all data will fail to save due to the illegal hashkey
  def set_vals_fail
    self.first_name = 'Daniel'
    self.url = {'home.page' => 'http://www.homepage.com'}
    save
  end
end

Be wary that default values that are not defined as lambdas or procs are evaluated at class load time, so the following 2 definitions are not equivalent.

field :dob, type: Time, default: Time.now
field :dob, type: Time, default: ->{ Time.now }

One of the drawbacks of having a schemaless database is that MongoDB must store all field information along with every document, meaning that it takes up a lot of storage space in RAM and on disk. A common pattern to limit this is to alias fields to a small number of characters, while keeping the domain in the application expressive.

class Band
  include Mongoid::Document
  field :n, as: :name, type: String
end

band = Band.new(name: "Placebo")
band.attributes #=> { "n" => "Placebo" }

criteria = Band.where(name: "Placebo")
criteria.selector #=> { "n" => "Placebo" }

You can define custom types in Mongoid and determine how they are serialized and deserialized. You simply need to provide 3 methods on it for Mongoid to call to convert your object to and from MongoDB friendly values.

class Point

  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  # Converts an object of this instance into a database friendly value.
  def mongoize
    [ x, y ]
  end

  class << self

    # Get the object as it was stored in the database, and instantiate
    # this custom class from it.
    def demongoize(object)
      Point.new(object[0], object[1])
    end

    # Takes any possible object and converts it to how it would be
    # stored in the database.
    def mongoize(object)
      case object
      when Point then object.mongoize
      when Hash then Point.new(object[:x], object[:y]).mongoize
      else object
      end
    end

    # Converts the object that was supplied to a criteria and converts it
    # into a database friendly form.
    def evolve(object)
      case object
      when Point then object.mongoize
      else object
      end
    end
  end
end

By default Mongoid doesn't support dynamic fields. You can tell Mongoid that you want dynamic fields by including Mongoid::Attributes::Dynamic in the model; attributes can then be set and persisted on the document even if no field was defined for them. There is a slight gotcha, however, when dealing with dynamic attributes: Mongoid is not completely lenient about the use of method_missing and breaking the public interface of the Document class.

class Person
  include Mongoid::Document
  include Mongoid::Attributes::Dynamic
end
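
Based on the behavior described above, reading a dynamic attribute that was never set raises a NoMethodError, while the bracket and read/write_attribute forms are safe (the attribute names here are illustrative):

person = Person.new
person[:gender] = "Male"         # set a dynamic attribute safely
person[:gender]                  #=> "Male"
person.write_attribute(:age, 30)
person.read_attribute(:age)      #=> 30
# person.gender works once the attribute has been set;
# person.nickname would raise NoMethodError if it was never set.
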
Mongoid now supports localized fields without the need for any external gem.
class Product
  include Mongoid::Document
  field :description, localize: true
end
I18n.default_locale = :en
product = Product.new
product.description = "Marvelous!"
I18n.locale = :de
product.description = "Fantastisch!"

product.attributes
#=> { "description" => { "en" => "Marvelous!", "de" => "Fantastisch!" }
product.description_translations
#=> { "en" => "Marvelous!", "de" => "Fantastisch!" }
product.description_translations =
  { "en" => "Marvelous!", "de" => "Wunderbar!" }
When using fallbacks, Mongoid will automatically use them when a translation is not available.
config.i18n.fallbacks = true
I18n.default_locale = :en
I18n.fallbacks = true
::I18n.fallbacks[:de] = [ :de, :en, :es ]
product = Product.new
product.description = "Marvelous!"
I18n.locale = :de
product.description #=> "Marvelous!"
There are various ways to view what has been altered on a model. Changes are recorded from the time a document is instantiated, either as a new document or via loading from the database up to the time it is saved. Any persistence operation clears the changes. If no changes have been made, Mongoid will not hit the database on a call to Model#save.
class Person
  include Mongoid::Document
  field :name, type: String
end

person = Person.first
person.name = "Alan Garner"

# Check to see if the document has changed.
person.changed? #=> true

# Get an array of the names of the changed fields.
person.changed #=> [ "name" ]

# Get a hash of the old and changed values for each field.
person.changes #=> { "name" => [ "Alan Parsons", "Alan Garner" ] }

# Check if a specific field has changed.
person.name_changed? #=> true

# Get the changes for a specific field.
person.name_change #=> [ "Alan Parsons", "Alan Garner" ]

# Get the previous value for a field.
person.name_was #=> "Alan Parsons"
person = Person.first
person.name = "Alan Garner"

# Reset the changed name back to the original
person.reset_name!
person.name #=> "Alan Parsons"
person = Person.first
person.name = "Alan Garner"
person.save #=> Clears out current changes.

# View the previous changes.
person.previous_changes #=> { "name" => [ "Alan Parsons", "Alan Garner" ] }

You can tell Mongoid that certain attributes are read-only. This allows documents to be created with these attributes, but changes to them will be filtered out.

class Band
  include Mongoid::Document
  field :name, type: String
  field :origin, type: String

  attr_readonly :name, :origin
end

band = Band.create(name: "Placebo")
band.update_attributes(name: "Tool") # Filters out the name change.
band.update_attribute(:name, "Tool") # Raises an error.
band.remove_attribute(:name) # Raises an error.
Mongoid supports inheritance in both root and embedded documents. When a document is inherited, its fields, relations, validations and scopes are copied down into the child document, but not vice versa.
class Canvas
  include Mongoid::Document
  field :name, type: String
  embeds_many :shapes
end
class Shape
  include Mongoid::Document
  field :x, type: Integer
  field :y, type: Integer
  embedded_in :canvas
end
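
# (Assumed for the example below: Circle and Rectangle are subclasses of
# Shape, and firefox is an instance of a Canvas subclass.)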
# Builds a Shape object
firefox.shapes.build({ x: 0, y: 0 })
# Builds a Circle object
firefox.shapes.build({ x: 0, y: 0 }, Circle)
# Creates a Rectangle object
firefox.shapes.create({ x: 0, y: 0 }, Rectangle)
Mongoid supplies a timestamping module in Mongoid::Timestamps which can be included to get basic behavior for created_at and updated_at fields.
class Person
  include Mongoid::Document
  include Mongoid::Timestamps
end

You may also choose to only have specific timestamps for creation or modification.

class Person
  include Mongoid::Document
  include Mongoid::Timestamps::Created
end

class Post
  include Mongoid::Document
  include Mongoid::Timestamps::Updated
end

If you want to turn off timestamping for specific calls, use the timeless method:

person.timeless.save
Person.timeless.create!

If you would like shorter timestamp field names stored in the database (c_at and u_at instead of created_at and updated_at), use the Short versions of the modules.

class Band
  include Mongoid::Document
  include Mongoid::Timestamps::Short # For c_at and u_at.
end

class Band
  include Mongoid::Document
  include Mongoid::Timestamps::Created::Short # For c_at only.
end

class Band
  include Mongoid::Document
  include Mongoid::Timestamps::Updated::Short # For u_at only.
end
Insert a document or multiple documents into the database.
Person.create([
  { first_name: "Heinrich", last_name: "Heine" },
  { first_name: "Willy", last_name: "Brandt" }
])

Person.create(first_name: "Heinrich") do |doc|
  doc.last_name = "Heine"
end
Validations can be bypassed if desired.
person.save(validate: false)
Update a single attribute, bypassing validations.
person.update_attribute(:first_name, "Jean")

Performs a MongoDB upsert on the document. If the document exists in the database, it will get overwritten with the current attributes of the document in memory. If the document does not exist in the database, it will be inserted. Note that this only runs the {before|after|around}_upsert callbacks.

person = Person.new(
  first_name: "Heinrich",
  last_name: "Heine"
)
person.upsert

Update the document’s updated_at timestamp, optionally with one extra provided time field. This will cascade the touch to all belongs_to relations of the document with the option set. This operation skips validations and callbacks.

person.touch
person.touch(:audited_at)

Deletes the document from the database without running callbacks.

person.delete

Deletes the document from the database while running destroy callbacks.

person.destroy

Deletes all documents from the database without running any callbacks.

Person.delete_all

You can change at runtime where to store, query, update, or remove documents by prefixing any operation with #with.

Band.with(database: "music-non-stop").create
Band.with(collection: "artists").delete_all
band.with(client: :tertiary).save!
band = Band.new(name: "Scuba")
band.with(collection: "artists").save!
band.update_attribute(:likes, 1000) # This will not save - tries the collection "bands"
band.with(collection: "artists").update_attribute(:likes, 1000) # This will update the document.

If you want to switch the persistence context for all operations at runtime, but don't want to use with all over your code, Mongoid provides the ability to do this globally at the client and database level. The methods for this are Mongoid.override_client and Mongoid.override_database.

class BandsController < ApplicationController
  before_filter :switch_database
  after_filter :reset_database

  private

  def switch_database
    I18n.locale = params[:locale] || I18n.default_locale
    Mongoid.override_database("my_db_name_#{I18n.locale}")
  end

  def reset_database
    Mongoid.override_database(nil)
  end
end
If you want to drop down to the driver level to perform operations, you can grab the Mongo client or collection from the model or document instance.
client = Band.mongo_client.with(write: { w: 0 }, database: "musik")
client[:artists].find(...)
collection_w_0 = Band.collection.with(write: { w: 0 })
collection_w_0[:artists].find(...)

Mongoid does not provide a mechanism for creating capped collections on the fly; they need to be created manually in advance, either through the driver or in the MongoDB shell:

client["name", :capped => true, :size => 1024].create
db.createCollection("name", { capped: true, size: 1024 });
Band.where(name: "Photek").first_or_initialize

Find documents for a provided JavaScript expression. This will wrap the JavaScript in a BSON::Code object, which is the safe way to avoid JavaScript injection attacks.

Band.for_js("this.name = param", param: "Tool")
length and size are the same as count, but they cache the value so that subsequent calls do not hit the database again.

Band.length
Band.where(name: "FKA Twigs").size

Eager loading is supported on all relations, with the exception of polymorphic belongs_to associations.

Band.includes(:albums).each do |band|
  p band.albums.first.name # Does not hit the database again.
end

Update attributes of the first matching document.

Band.where(name: "Photek").update(label: "Mute")

Update attributes of all matching documents.

Band.where(members: 2).update_all(label: "Mute")
Named scopes can take procs and blocks for accepting parameters or extending functionality.
class Band
  include Mongoid::Document
  field :name, type: String
  field :country, type: String
  field :active, type: Boolean, default: true

  scope :named, ->(name){ where(name: name) }
  scope :active, ->{
    where(active: true) do
      def deutsch
        tap do |scope|
          scope.selector.store("origin" => "Deutschland")
        end
      end
    end
  }
end

Band.named("Depeche Mode") # Find Depeche Mode.
Band.active.deutsch # Find active German bands.

Default scopes can be useful when you find yourself applying the same criteria to most queries, and want something to be there by default. Default scopes take procs that return criteria objects.

class Band
  include Mongoid::Document
  field :name, type: String
  field :active, type: Boolean, default: true

  default_scope ->{ where(active: true) }
end

Band.each do |band|
  # All bands here are active.
end
You can tell Mongoid not to apply the default scope by using unscoped, which can be inline or take a block.
Band.unscoped.where(name: "Depeche Mode")
Band.unscoped do
  Band.where(name: "Depeche Mode")
end
Note that chaining unscoped with a scope does not work. In these cases, it is recommended that you use the block form of unscoped:

Client.unscoped {
  Client.created_before(Time.zone.now)
}

You can also tell Mongoid to explicitly apply the default scope again later to always ensure it’s there.

Band.unscoped.where(name: "Depeche Mode").scoped
If you are using a default scope on a model that is part of a relation, you must reload the relation to have scoping reapplied. This is important to note if you change a value of a document in the relation that would affect its visibility within the scoped relation.
class Label
  include Mongoid::Document
  embeds_many :bands
end

class Band
  include Mongoid::Document
  field :active, default: true
  embedded_in :label
  default_scope ->{ where(active: true) }
end

label.bands.push(band)
label.bands #=> [ band ]
band.update_attribute(:active, false)
label.bands #=> [ band ] # must reload
label.reload.bands #=> []
Class methods on models that return criteria objects are also treated like scopes, and can be chained as well.
class Band
  include Mongoid::Document
  field :name, type: String
  field :active, type: Boolean, default: true

  def self.active
    where(active: true)
  end
end

Band.active

Mongoid provides a DSL around MongoDB's map/reduce framework, for performing custom map/reduce jobs or simple aggregations.
map = %Q{
  function() {
    emit(this.name, { likes: this.likes });
  }
}

reduce = %Q{
  function(key, values) {
    var result = { likes: 0 };
    values.forEach(function(value) {
      result.likes += value.likes;
    });
    return result;
  }
}

Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: true)
Just like criteria, map/reduce calls are lazily evaluated. So nothing will hit the database until you iterate over the results, or make a call on the wrapper that would need to force a database hit.
Band.map_reduce(map, reduce).out(replace: "mr-results").each do |document|
  p document # { "_id" => "Tool", "value" => { "likes" => 200 }}
end
  • inline: 1: Don’t store the output in a collection.
  • replace: "name": Store in a collection with the provided name, and overwrite any documents that exist in it.
  • merge: "name": Store in a collection with the provided name, and merge the results with the existing documents.
  • reduce: "name": Store in a collection with the provided name, and reduce all existing results in that collection.

All relations contain a target, which is the proxied document or documents, a base which is the document the relation hangs off, and metadata which provides information about the relation.

class Person
  include Mongoid::Document
  embeds_many :addresses
end

person.addresses = [ address ]
person.addresses.target # returns [ address ]
person.addresses.base # returns person
person.addresses.metadata # returns the metadata
All relations can have extensions, which provide a way to add application-specific functionality to a relation. They are defined by providing a block to the relation definition.
class Person
  include Mongoid::Document
  embeds_many :addresses do
    def find_by_country(country)
      where(country: country).first
    end
    def chinese
      @target.select { |address| address.country == "China" }
    end
  end
end

person.addresses.find_by_country("Mongolia") # returns address
person.addresses.chinese # returns [ address ]

It is important to note that, by default, Mongoid validates the children of any relation that are loaded into memory via validates_associated. If you do not want this behavior, you may turn it off when defining the relation.

class Person
  include Mongoid::Document

  embeds_many :addresses, validate: false
  has_many :posts, validate: false
end
If you want the embedded document callbacks to fire when calling a persistence operation on its parent, you will need to provide the cascade callbacks option to the relation.
class Band
  include Mongoid::Document
  embeds_many :albums, cascade_callbacks: true
  embeds_one :label, cascade_callbacks: true
end

band.save # Fires all save callbacks on the band, albums, and label.

You can provide dependent options to referenced associations to instruct Mongoid how to handle situations where one side of the relation is deleted, or an attempt is made to delete it. The options are as follows:

  • :delete: Delete the child document without running any of the model callbacks.
  • :destroy: Destroy the child document and run all of the model callbacks.
  • :nullify: Orphan the child document.
  • :restrict: Raise an error if the child is not empty.
class Band
  include Mongoid::Document
  has_many :albums, dependent: :delete
  belongs_to :label, dependent: :nullify
end

class Album
  include Mongoid::Document
  belongs_to :band
end

class Label
  include Mongoid::Document
  has_many :bands, dependent: :restrict
end

One core difference between Mongoid and Active Record from a behavior standpoint is that Mongoid does not automatically save child relations for relational associations. This is for performance reasons.

To enable an autosave on a relational association (embedded associations do not need this since they are actually part of the parent in the database) add the autosave option to the relation.
class Band
  include Mongoid::Document
  has_many :albums, autosave: true
end

A document can recursively embed itself using recursively_embeds_one or recursively_embeds_many, which provides accessors for the parent and children via parent_ and child_ methods.

class Tag
  include Mongoid::Document
  recursively_embeds_many
end
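
A short usage sketch (the accessors follow the parent_/child_ naming described above):

root  = Tag.new
child = root.child_tags.build   # children are reached via child_tags
child.parent_tag                #=> root
root.parent_tag                 #=> nil
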
All relations have existence predicates on them, in the form of name? and has_name?, to check whether the relation is blank.
band.label?
band.has_label?
band.albums?
band.has_albums?
One to one relations (embeds_one, has_one) have an autobuild option which tells Mongoid to instantiate a new document when the relation is accessed and it is nil.
class Band
  include Mongoid::Document
  embeds_one :label, autobuild: true
  has_one :producer, autobuild: true
end
Any belongs_to relation, no matter where it hangs off from, can take an optional :touch option, which will call the touch method on it, and on any parent relations that define the option, whenever the base document's #touch is called.
class Band
  include Mongoid::Document
  belongs_to :label, touch: true
end

You can create a one sided many to many if you want to mimic a has_many that stores the keys as an array on the parent.
class Band
  include Mongoid::Document
  has_and_belongs_to_many :tags, inverse_of: nil
end

class Tag
  include Mongoid::Document
  field :name, type: String
end
Nested attributes can be enabled for any relation, embedded or referenced. To enable this for the relation, simply provide the relation name to the accepts_nested_attributes_for macro.
class Band
  include Mongoid::Document
  embeds_many :albums
  belongs_to :producer
  accepts_nested_attributes_for :albums, :producer
end
Note that when you add nested attributes functionality to a referenced relation, Mongoid will automatically enable autosave for that relation.
When a relation gains nested attributes behavior, an additional method is added to the base model, which should be used to update the attributes with the new functionality. This method is the relation name plus _attributes=.
band = Band.first
band.producer_attributes = { name: "Flood" }
band.attributes = { producer_attributes: { name: "Flood" }}
Note that this will work with any attribute based setter method in Mongoid. This includes: update_attributes, update_attributes! and attributes=.
Note that using callbacks for domain logic is a bad design practice, and can lead to unexpected errors that are hard to debug when callbacks in the chain halt execution. It is our recommendation to only use them for cross-cutting concerns, like queueing up background jobs.
class Article
  include Mongoid::Document
  field :name, type: String
  field :body, type: String
  field :slug, type: String

  before_create :send_message

  after_save do |document|
    # Handle callback here.
  end

  protected
  def send_message
    # Message sending code here.
  end
end

Callbacks come from Active Support, so you can use the newer syntax as well:

class Article
  include Mongoid::Document
  field :name, type: String

  set_callback(:create, :before) do |document|
    # Message sending code here.
  end
end

Mongoid has a set of callbacks that are specific to collection based relations – these are:

  • after_add
  • after_remove
  • before_add
  • before_remove
class Person
  include Mongoid::Document

  has_many :posts, after_add: :send_email_to_subscribers

  def send_email_to_subscribers(post)
    Notifications.new_post(post).deliver
  end
end
You can define indexes on documents using the index macro. Provide the key for the index along with a direction. For additional options, supply them in a second options hash parameter.
class Person
  include Mongoid::Document
  field :ssn

  index({ ssn: 1 }, { unique: true, name: "ssn_index" })
end
You can define indexes on embedded document fields as well.
class Person
  include Mongoid::Document
  embeds_many :addresses
  index "addresses.street" => 1
end
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field. The index is “sparse” because it does not include all documents of a collection.
class Person
  include Mongoid::Document
  field :ssn

  index({ ssn: -1 }, { sparse: true })
end
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build.
class Person
  include Mongoid::Document
  field :ssn
  index({ ssn: 1 }, { unique: true, background: true })
end

For unique indexes that are defined for a column that might already have duplicate values, you can drop the duplicate entries:

class Person
  include Mongoid::Document
  field :ssn
  index({ ssn: 1 }, { unique: true, drop_dups: true })
end

Indexes can be scoped to a specific database.

class Person
  include Mongoid::Document
  field :ssn
  index({ ssn: 1 }, { database: "users", unique: true, background: true })
end

You can have Mongoid define indexes for you on “foreign key” fields for relational associations. This only works on the relation macro that the foreign key is stored on.

class Comment
  include Mongoid::Document
  belongs_to :post, index: true
  has_and_belongs_to_many :preferences, index: true
end

When you want to create the indexes in the database, use the provided rake task.

$ rake db:mongoid:create_indexes

Mongoid also provides a rake task to delete all secondary indexes.

$ rake db:mongoid:remove_indexes

In order to properly set up single collection inheritance, Mongoid needs to preload all models before every request in development mode. This can get slow, so if you are not using any inheritance it is recommended you turn this feature off.

config.mongoid.preload_models = false

Mongoid provides the following rake tasks when used in a Rails 3 environment:

  • db:create: Exists only for dependency purposes, does not actually do anything.
  • db:create_indexes: Reads all index definitions from the models and attempts to create them in the database.
  • db:remove_indexes: Reads all secondary index definitions from the models and removes those indexes from the database.
  • db:drop: Drops all collections in the database with the exception of the system collections.
  • db:migrate: Exists only for dependency purposes, does not actually do anything.
  • db:purge: Deletes all data, including indexes, from the database (since 3.1.0).
  • db:schema:load: Exists only for framework dependency purposes, does not actually do anything.
  • db:seed: Seeds the database from db/seeds.rb
  • db:setup: Creates indexes and seeds the database.
  • db:test:prepare: Exists only for framework dependency purposes, does not actually do anything.

find_and_modify has been removed and replaced with three methods: find_one_and_update, find_one_and_delete and find_one_and_replace.

Quote

What? class << self


First, the class << foo syntax opens up foo’s singleton class (eigenclass). This allows you to specialise the behaviour of methods called on that specific object.

a = 'foo'
class << a
  def inspect
    '"bar"'
  end
end
a.inspect   # => "bar"

a = 'foo'   # new object, new singleton class
a.inspect   # => "foo"

Now, to answer the question: class << self opens up self’s singleton class, so that methods can be redefined for the current self object (which, inside a class or module body, is the class or module itself). Usually, this is used to define class/module (“static”) methods:

class String
  class << self
    def value_of obj
      obj.to_s
    end
  end
end

String.value_of 42   # => "42"

This can also be written as a shorthand:

class String
  def self.value_of obj
    obj.to_s
  end
end

Or even shorter:

def String.value_of obj
  obj.to_s
end

When inside a function definition, self refers to the object the function is being called with. In this case, class << self opens the singleton class for that object; one use of that is to implement a poor man’s state machine:

class StateMachineExample
  def process obj
    process_hook obj
  end

private
  def process_state_1 obj
    # ...
    class << self
      alias process_hook process_state_2
    end
  end

  def process_state_2 obj
    # ...
    class << self
      alias process_hook process_state_1
    end
  end

  # Set up initial state
  alias process_hook process_state_1
end

So, in the example above, each instance of StateMachineExample has process_hook aliased to process_state_1, but note how in the latter, it can redefine process_hook (for self only, not affecting other StateMachineExample instances) to process_state_2. So, each time a caller calls the process method (which calls the redefinable process_hook), the behaviour changes depending on what state it’s in.

Data types with variable length cause row migrations

Some data types, such as NUMBER and VARCHAR2, use variable-length storage to save space, but they can cause row migration. For NUMBER:

select dump(0) length from dual;
Typ=2 Len=1: 128

select dump(1) length from dual;
Typ=2 Len=2: 193,2

select dump(10000000000000) length from dual;
Typ=2 Len=2: 199,11

select dump(111) length from dual;
Typ=2 Len=3: 194,2,12

As you can see, NUMBER is stored in a compressed, variable-length format. A large number like 10000000000000 uses only two bytes, while 111 uses one byte more; 0 takes a single byte, but 1 takes two. So if a status column changes from 0 to 1 and there is no free space left in the block, a row migration will take place.
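
One way to observe the effect is Oracle's chained-rows analysis (a sketch; it assumes the CHAINED_ROWS table created by the utlchain.sql script, and the table name "orders" is illustrative):

ANALYZE TABLE orders LIST CHAINED ROWS INTO chained_rows;

SELECT count(*) AS migrated_or_chained_rows
FROM   chained_rows
WHERE  table_name = 'ORDERS';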