Retrieving Data & Results Sets

  • class Cake\ORM\Table

While table objects provide an abstraction around a ‘repository’ or collectionof objects, when you query for individual records you get ‘entity’ objects.While this section discusses the different ways you can find and load entities,you should read the Entities section for more information onentities.

Debugging Queries and ResultSets

Since the ORM now returns Collections and Entities, debugging these objects canbe more complicated than in previous CakePHP versions. There are now variousways to inspect the data returned by the ORM.

  • debug($query) Shows the SQL and bound parameters, does not show results.
  • sql($query) Shows the final rendered SQL, but only when having DebugKit installed.
  • debug($query->all()) Shows the ResultSet properties (not the results).
  • debug($query->toList()) An easy way to show each of the results.
  • debug(iterator_to_array($query)) Shows query results in an array format.
  • debug(json_encode($query, JSON_PRETTY_PRINT)) More human readable results.
  • debug($query->first()) Show the properties of a single entity.
  • debug((string)$query->first()) Show the properties of a single entity as JSON.

Getting a Single Entity by Primary Key

  • Cake\ORM\Table::get($id, $options = [])

It is often convenient to load a single entity from the database when editing orviewing entities and their related data. You can do this by using get():

  1. // In a controller or table method.
  2.  
  3. // Get a single article
  4. $article = $articles->get($id);
  5.  
  6. // Get a single article, and related comments
  7. $article = $articles->get($id, [
  8. 'contain' => ['Comments']
  9. ]);

If the get operation does not find any results aCake\Datasource\Exception\RecordNotFoundException will be raised. You caneither catch this exception yourself, or allow CakePHP to convert it into a 404error.

Like find(), get() also has caching integrated. You can use thecache option when calling get() to perform read-through caching:

  1. // In a controller or table method.
  2.  
  3. // Use any cache config or CacheEngine instance & a generated key
  4. $article = $articles->get($id, [
  5. 'cache' => 'custom',
  6. ]);
  7.  
  8. // Use any cache config or CacheEngine instance & specific key
  9. $article = $articles->get($id, [
  10. 'cache' => 'custom', 'key' => 'mykey'
  11. ]);
  12.  
  13. // Explicitly disable caching
  14. $article = $articles->get($id, [
  15. 'cache' => false
  16. ]);

Optionally you can get() an entity using Custom Finder Methods. Forexample you may want to get all translations for an entity. You can achieve thatby using the finder option:

  1. $article = $articles->get($id, [
  2. 'finder' => 'translations',
  3. ]);

Using Finders to Load Data

  • Cake\ORM\Table::find($type, $options = [])

Before you can work with entities, you’ll need to load them. The easiest way todo this is using the find() method. The find method provides an easy andextensible way to find the data you are interested in:

  1. // In a controller or table method.
  2.  
  3. // Find all the articles
  4. $query = $articles->find('all');

The return value of any find() method is alwaysa Cake\ORM\Query object. The Query class allows you to furtherrefine a query after creating it. Query objects are evaluated lazily, and do notexecute until you start fetching rows, convert it to an array, or when theall() method is called:

  1. // In a controller or table method.
  2.  
  3. // Find all the articles.
  4. // At this point the query has not run.
  5. $query = $articles->find('all');
  6.  
  7. // Iteration will execute the query.
  8. foreach ($query as $row) {
  9. }
  10.  
  11. // Calling all() will execute the query
  12. // and return the result set.
  13. $results = $query->all();
  14.  
  15. // Once we have a result set we can get all the rows
  16. $data = $results->toList();
  17.  
  18. // Converting the query to a key-value array will also execute it.
  19. $data = $query->toArray();

Note

Once you’ve started a query you can use the Query Builderinterface to build more complex queries, adding additional conditions,limits, or include associations using the fluent interface.

  1. // In a controller or table method.
  2. $query = $articles->find('all')
  3. ->where(['Articles.created >' => new DateTime('-10 days')])
  4. ->contain(['Comments', 'Authors'])
  5. ->limit(10);

You can also provide many commonly used options to find(). This can helpwith testing as there are fewer methods to mock:

  1. // In a controller or table method.
  2. $query = $articles->find('all', [
  3. 'conditions' => ['Articles.created >' => new DateTime('-10 days')],
  4. 'contain' => ['Authors', 'Comments'],
  5. 'limit' => 10
  6. ]);

The list of options supported by find() are:

  • conditions provide conditions for the WHERE clause of your query.
  • limit Set the number of rows you want.
  • offset Set the page offset you want. You can also use page to makethe calculation simpler.
  • contain define the associations to eager load.
  • fields limit the fields loaded into the entity. Only loading some fieldscan cause entities to behave incorrectly.
  • group add a GROUP BY clause to your query. This is useful when usingaggregating functions.
  • having add a HAVING clause to your query.
  • join define additional custom joins.
  • order order the result set.

Any options that are not in this list will be passed to beforeFind listenerswhere they can be used to modify the query object. You can use thegetOptions() method on a query object to retrieve the options used. Whileyou can pass query objects to your controllers, we recommend that you packageyour queries up as Custom Finder Methods instead. Using custom findermethods will let you re-use your queries and make testing easier.

By default queries and result sets will return Entities objects. Youcan retrieve basic arrays by disabling hydration:

  1. $query->disableHydration();
  2.  
  3. // $data is ResultSet that contains array data.
  4. $data = $query->all();

Getting the First Result

The first() method allows you to fetch only the first row from a query. Ifthe query has not been executed, a LIMIT 1 clause will be applied:

  1. // In a controller or table method.
  2. $query = $articles->find('all', [
  3. 'order' => ['Articles.created' => 'DESC']
  4. ]);
  5. $row = $query->first();

This approach replaces find('first') in previous versions of CakePHP. Youmay also want to use the get() method if you are loading entities by primarykey.

Note

The first() method will return null if no results are found.

Getting a Count of Results

Once you have created a query object, you can use the count() method to geta result count of that query:

  1. // In a controller or table method.
  2. $query = $articles->find('all', [
  3. 'conditions' => ['Articles.title LIKE' => '%Ovens%']
  4. ]);
  5. $number = $query->count();

See Returning the Total Count of Records for additional usage of the count() method.

Finding Key/Value Pairs

It is often useful to generate an associative array of data from yourapplication’s data. For example, this is very useful when creating <select>elements. CakePHP provides a simple to use method for generating ‘lists’ ofdata:

  1. // In a controller or table method.
  2. $query = $articles->find('list');
  3. $data = $query->toArray();
  4.  
  5. // Data now looks like
  6. $data = [
  7. 1 => 'First post',
  8. 2 => 'Second article I wrote',
  9. ];

With no additional options the keys of $data will be the primary key of yourtable, while the values will be the ‘displayField’ of the table. You can use thesetDisplayField() method on a table object to configure the display field ofa table:

  1. class ArticlesTable extends Table
  2. {
  3. public function initialize(array $config): void
  4. {
  5. $this->setDisplayField('title');
  6. }
  7. }

When calling list you can configure the fields used for the key and valuewith the keyField and valueField options respectively:

  1. // In a controller or table method.
  2. $query = $articles->find('list', [
  3. 'keyField' => 'slug',
  4. 'valueField' => 'title'
  5. ]);
  6. $data = $query->toArray();
  7.  
  8. // Data now looks like
  9. $data = [
  10. 'first-post' => 'First post',
  11. 'second-article-i-wrote' => 'Second article I wrote',
  12. ];

Results can be grouped into nested sets. This is useful when you wantbucketed sets, or want to build <optgroup> elements with FormHelper:

  1. // In a controller or table method.
  2. $query = $articles->find('list', [
  3. 'keyField' => 'slug',
  4. 'valueField' => 'title',
  5. 'groupField' => 'author_id'
  6. ]);
  7. $data = $query->toArray();
  8.  
  9. // Data now looks like
  10. $data = [
  11. 1 => [
  12. 'first-post' => 'First post',
  13. 'second-article-i-wrote' => 'Second article I wrote',
  14. ],
  15. 2 => [
  16. // More data.
  17. ]
  18. ];

You can also create list data from associations that can be reached with joins:

  1. $query = $articles->find('list', [
  2. 'keyField' => 'id',
  3. 'valueField' => 'author.name'
  4. ])->contain(['Authors']);

Customize Key-Value Output

Lastly it is possible to use closures to access entity accessor methods in yourlist finds.

  1. // In your Authors Entity create a virtual field to be used as the displayField:
  2. protected function _getLabel()
  3. {
  4. return $this->_properties['first_name'] . ' ' . $this->_properties['last_name']
  5. . ' / ' . __('User ID %s', $this->_properties['user_id']);
  6. }

This example shows using the _getLabel() accessor method fromthe Author entity.

  1. // In your finders/controller:
  2. $query = $articles->find('list', [
  3. 'keyField' => 'id',
  4. 'valueField' => function ($article) {
  5. return $article->author->get('label');
  6. }
  7. ]);

You can also fetch the label in the list directly using.

  1. // In AuthorsTable::initialize():
  2. $this->setDisplayField('label'); // Will utilize Author::_getLabel()
  3. // In your finders/controller:
  4. $query = $authors->find('list'); // Will utilize AuthorsTable::getDisplayField()

Finding Threaded Data

The find('threaded') finder returns nested entities that are threadedtogether through a key field. By default this field is parent_id. Thisfinder allows you to access data stored in an ‘adjacency list’ style table. Allentities matching a given parent_id are placed under the childrenattribute:

  1. // In a controller or table method.
  2. $query = $comments->find('threaded');
  3.  
  4. // Expanded default values
  5. $query = $comments->find('threaded', [
  6. 'keyField' => $comments->primaryKey(),
  7. 'parentField' => 'parent_id'
  8. ]);
  9. $results = $query->toArray();
  10.  
  11. echo count($results[0]->children);
  12. echo $results[0]->children[0]->comment;

The parentField and keyField keys can be used to define the fields thatthreading will occur on.

Tip

If you need to manage more advanced trees of data, consider usingTree instead.

Custom Finder Methods

The examples above show how to use the built-in all and list finders.However, it is possible and recommended that you implement your own findermethods. Finder methods are the ideal way to package up commonly used queries,allowing you to abstract query details into a simple to use method. Findermethods are defined by creating methods following the convention of findFoowhere Foo is the name of the finder you want to create. For example if wewanted to add a finder to our articles table for finding published articles wewould do the following:

  1. use Cake\ORM\Query;
  2. use Cake\ORM\Table;
  3.  
  4. class ArticlesTable extends Table
  5. {
  6. public function findOwnedBy(Query $query, array $options)
  7. {
  8. $user = $options['user'];
  9. return $query->where(['author_id' => $user->id]);
  10. }
  11. }
  12.  
  13. $query = $articles->find('ownedBy', ['user' => $userEntity]);

Finder methods can modify the query as required, or use the $options tocustomize the finder operation with relevant application logic. You can also‘stack’ finders, allowing you to express complex queries effortlessly. Assumingyou have both the ‘published’ and ‘recent’ finders, you could do the following:

  1. $query = $articles->find('published')->find('recent');

While all the examples so far have shown finder methods on table classes, findermethods can also be defined on Behaviors.

If you need to modify the results after they have been fetched you should usea Modifying Results with Map/Reduce function to modify the results. The map reduce featuresreplace the ‘afterFind’ callback found in previous versions of CakePHP.

Note

Passing arguments exposed in the config array,$products->find('sizes', ['large', 'medium'])can give unexpected results when chainingcustom finders. Always pass options as an associative array,$products->find('sizes', ['values' => ['large', 'medium']])

Dynamic Finders

CakePHP’s ORM provides dynamically constructed finder methods which allow you toexpress simple queries with no additional code. For example if you wanted tofind a user by username you could do:

  1. // In a controller
  2. // The following two calls are equal.
  3. $query = $this->Users->findByUsername('joebob');
  4. $query = $this->Users->findAllByUsername('joebob');

When using dynamic finders you can constrain on multiple fields:

  1. $query = $users->findAllByUsernameAndApproved('joebob', 1);

You can also create OR conditions:

  1. $query = $users->findAllByUsernameOrEmail('joebob', 'joe@example.com');

While you can use either OR or AND conditions, you cannot combine thetwo in a single dynamic finder. Other query options like contain are alsonot supported with dynamic finders. You should use Custom Finder Methods toencapsulate more complex queries. Lastly, you can also combine dynamic finderswith custom finders:

  1. $query = $users->findTrollsByUsername('bro');

The above would translate into the following:

  1. $users->find('trolls', [
  2. 'conditions' => ['username' => 'bro']
  3. ]);

Once you have a query object from a dynamic finder, you’ll need to callfirst() if you want the first result.

Note

While dynamic finders make it simple to express queries, they add a smallamount of overhead. You cannot call findBy methods from a query object.When using a finder chain the dynamic finder must be called first.

Retrieving Associated Data

When you want to grab associated data, or filter based on associated data, thereare two ways:

  • use CakePHP ORM query functions like contain() and matching()
  • use join functions like innerJoin(), leftJoin(), and rightJoin()

You should use contain() when you want to load the primary model, and itsassociated data. While contain() will let you apply additional conditions tothe loaded associations, you cannot constrain the primary model based on theassociations. For more details on the contain(), look atEager Loading Associations Via Contain.

You should use matching() when you want to restrict the primary model basedon associations. For example, you want to load all the articles that havea specific tag on them. For more details on the matching(), look atFiltering by Associated Data Via Matching And Joins.

If you prefer to use join functions, you can look atAdding Joins for more information.

Eager Loading Associations Via Contain

By default CakePHP does not load any associated data when using find().You need to ‘contain’ or eager-load each association you want loaded in yourresults.

Eager loading helps avoid many of the potential performance problemssurrounding lazy-loading in an ORM. The queries generated by eager loading canbetter leverage joins, allowing more efficient queries to be made. In CakePHPyou define eager loaded associations using the ‘contain’ method:

  1. // In a controller or table method.
  2.  
  3. // As an option to find()
  4. $query = $articles->find('all', ['contain' => ['Authors', 'Comments']]);
  5.  
  6. // As a method on the query object
  7. $query = $articles->find('all');
  8. $query->contain(['Authors', 'Comments']);

The above will load the related author and comments for each article in theresult set. You can load nested associations using nested arrays to define theassociations to be loaded:

  1. $query = $articles->find()->contain([
  2. 'Authors' => ['Addresses'], 'Comments' => ['Authors']
  3. ]);

Alternatively, you can express nested associations using the dot notation:

  1. $query = $articles->find()->contain([
  2. 'Authors.Addresses',
  3. 'Comments.Authors'
  4. ]);

You can eager load associations as deep as you like:

  1. $query = $products->find()->contain([
  2. 'Shops.Cities.Countries',
  3. 'Shops.Managers'
  4. ]);

You can select fields from all associations with multiple easy contain()statements:

  1. $query = $this->find()->select([
  2. 'Realestates.id',
  3. 'Realestates.title',
  4. 'Realestates.description'
  5. ])
  6. ->contain([
  7. 'RealestateAttributes' => [
  8. 'Attributes' => [
  9. 'fields' => [
  10. // Aliased fields in contain() must include
  11. // the model prefix to be mapped correctly.
  12. 'Attributes__name' => 'attr_name'
  13. ]
  14. ]
  15. ]
  16. ])
  17. ->contain([
  18. 'RealestateAttributes' => [
  19. 'fields' => [
  20. 'RealestateAttributes.realestate_id',
  21. 'RealestateAttributes.value'
  22. ]
  23. ]
  24. ])
  25. ->where($condition);

If you need to reset the containments on a query you can set the second argumentto true:

  1. $query = $articles->find();
  2. $query->contain(['Authors', 'Comments'], true);

Passing Conditions to Contain

When using contain() you are able to restrict the data returned by theassociations and filter them by conditions. To specify conditions, pass an anonymousfunction that receives as the first argument a query object, \Cake\ORM\Query:

  1. // In a controller or table method.
  2. $query = $articles->find()->contain('Comments', function (Query $q) {
  3. return $q
  4. ->select(['body', 'author_id'])
  5. ->where(['Comments.approved' => true]);
  6. });

This also works for pagination at the Controller level:

  1. $this->paginate['contain'] = [
  2. 'Comments' => function (Query $query) {
  3. return $query->select(['body', 'author_id'])
  4. ->where(['Comments.approved' => true]);
  5. }
  6. ];

Warning

If the results are missing association entities, make sure the foreign key columnsare selected in the query. Without the foreign keys, the ORM cannot find matching rows.

It is also possible to restrict deeply-nested associations using the dotnotation:

  1. $query = $articles->find()->contain([
  2. 'Comments',
  3. 'Authors.Profiles' => function (Query $q) {
  4. return $q->where(['Profiles.is_published' => true]);
  5. }
  6. ]);

In the above example, you’ll still get authors even if they don’t havea published profile. To only get authors with a published profile usematching(). If you have defined customfinders in your associations, you can use them inside contain():

  1. // Bring all articles, but only bring the comments that are approved and
  2. // popular.
  3. $query = $articles->find()->contain('Comments', function (Query $q) {
  4. return $q->find('approved')->find('popular');
  5. });

Note

With BelongsTo and HasOne associations only select and where clausesare valid in the contain() query. With HasMany and BelongsToMany allclauses such as order() are valid.

You can control more than just the query clauses used by contain(). If you pass an arraywith the association, you can override the foreignKey, joinType and strategy.See the ref:associations for details on the default value and options for eachassociation type.

You can pass false as the new foreignKey to disable foreign key constraints entirely.Use the queryBuilder option to customize the query when using an array:

  1. $query = $articles->find()->contain([
  2. 'Authors' => [
  3. 'foreignKey' => false,
  4. 'queryBuilder' => function (Query $q) {
  5. return $q->where(...); // Full conditions for filtering
  6. }
  7. ]
  8. ]);

If you have limited the fields you are loading with select() but also want toload fields off of contained associations, you can pass the association objectto select():

  1. // Select id & title from articles, but all fields off of Users.
  2. $query = $articles->find()
  3. ->select(['id', 'title'])
  4. ->select($articles->Users)
  5. ->contain(['Users']);

Alternatively, if you have multiple associations, you can use enableAutoFields():

  1. // Select id & title from articles, but all fields off of Users, Comments
  2. // and Tags.
  3. $query->select(['id', 'title'])
  4. ->contain(['Comments', 'Tags'])
  5. ->enableAutoFields(true)
  6. ->contain(['Users' => function(Query $q) {
  7. return $q->autoFields(true);
  8. }]);

Sorting Contained Associations

When loading HasMany and BelongsToMany associations, you can use the sortoption to sort the data in those associations:

  1. $query->contain([
  2. 'Comments' => [
  3. 'sort' => ['Comments.created' => 'DESC']
  4. ]
  5. ]);

Filtering by Associated Data Via Matching And Joins

A fairly common query case with associations is finding records ‘matching’specific associated data. For example if you have ‘Articles belongsToMany Tags’you will probably want to find Articles that have the CakePHP tag. This isextremely simple to do with the ORM in CakePHP:

  1. // In a controller or table method.
  2.  
  3. $query = $articles->find();
  4. $query->matching('Tags', function ($q) {
  5. return $q->where(['Tags.name' => 'CakePHP']);
  6. });

You can apply this strategy to HasMany associations as well. For example if‘Authors HasMany Articles’, you could find all the authors with recentlypublished articles using the following:

  1. $query = $authors->find();
  2. $query->matching('Articles', function ($q) {
  3. return $q->where(['Articles.created >=' => new DateTime('-10 days')]);
  4. });

Filtering by deep associations is surprisingly easy, and the syntax should bealready familiar to you:

  1. // In a controller or table method.
  2. $query = $products->find()->matching(
  3. 'Shops.Cities.Countries', function ($q) {
  4. return $q->where(['Countries.name' => 'Japan']);
  5. }
  6. );
  7.  
  8. // Bring unique articles that were commented by 'markstory' using passed variable
  9. // Dotted matching paths should be used over nested matching() calls
  10. $username = 'markstory';
  11. $query = $articles->find()->matching('Comments.Users', function ($q) use ($username) {
  12. return $q->where(['username' => $username]);
  13. });

Note

As this function will create an INNER JOIN, you might want to considercalling distinct on the find query as you might get duplicate rows ifyour conditions don’t exclude them already. This might be the case, forexample, when the same users comments more than once on a single article.

The data from the association that is ‘matched’ will be available on the_matchingData property of entities. If both match and contain the sameassociation, you can expect to get both the _matchingData and standardassociation properties in your results.

Using innerJoinWith

Sometimes you need to match specific associated data but without actuallyloading the matching records like matching(). You can create just theINNER JOIN that matching() uses with innerJoinWith():

  1. $query = $articles->find();
  2. $query->innerJoinWith('Tags', function ($q) {
  3. return $q->where(['Tags.name' => 'CakePHP']);
  4. });

innerJoinWith() allows you to the same parameters and dot notation:

  1. $query = $products->find()->innerJoinWith(
  2. 'Shops.Cities.Countries', function ($q) {
  3. return $q->where(['Countries.name' => 'Japan']);
  4. }
  5. );

You can combine innerJoinWith() and contain() with the same associationwhen you want to match specific records and load the associated data together.The example below matches Articles that have specific Tags and loads the same Tags:

  1. $filter = ['Tags.name' => 'CakePHP'];
  2. $query = $articles->find()
  3. ->distinct($articles)
  4. ->contain('Tags', function (Query $q) use ($filter) {
  5. return $q->where($filter);
  6. })
  7. ->innerJoinWith('Tags', function (Query $q) use ($filter) {
  8. return $q->where($filter);
  9. });

Note

If you use innerJoinWith() and want to select() fields from that association,you need to use an alias for the field:

  1. $query
  2. ->select(['country_name' => 'Countries.name'])
  3. ->innerJoinWith('Countries');

If you don’t use an alias, you will see the data in _matchingData as describedby matching() above. This is an edge case from matching() not knowing youmanually selected the field.

Warning

You should not combine innerJoinWith() and matching() with the same association.This will produce multiple INNER JOIN statements and might not create the query youexpected.

Using notMatching

The opposite of matching() is notMatching(). This function will changethe query so that it filters results that have no relation to the specifiedassociation:

  1. // In a controller or table method.
  2.  
  3. $query = $articlesTable
  4. ->find()
  5. ->notMatching('Tags', function ($q) {
  6. return $q->where(['Tags.name' => 'boring']);
  7. });

The above example will find all articles that were not tagged with the wordboring. You can apply this method to HasMany associations as well. You could,for example, find all the authors with no published articles in the last 10days:

  1. $query = $authorsTable
  2. ->find()
  3. ->notMatching('Articles', function ($q) {
  4. return $q->where(['Articles.created >=' => new \DateTime('-10 days')]);
  5. });

It is also possible to use this method for filtering out records not matchingdeep associations. For example, you could find articles that have not beencommented on by a certain user:

  1. $query = $articlesTable
  2. ->find()
  3. ->notMatching('Comments.Users', function ($q) {
  4. return $q->where(['username' => 'jose']);
  5. });

Since articles with no comments at all also satisfy the condition above, you maywant to combine matching() and notMatching() in the same query. Thefollowing example will find articles having at least one comment, but notcommented by a certain user:

  1. $query = $articlesTable
  2. ->find()
  3. ->notMatching('Comments.Users', function ($q) {
  4. return $q->where(['username' => 'jose']);
  5. })
  6. ->matching('Comments');

Note

As notMatching() will create a LEFT JOIN, you might want to considercalling distinct on the find query as you can get duplicate rowsotherwise.

Keep in mind that contrary to the matching() function, notMatching()will not add any data to the _matchingData property in the results.

Using leftJoinWith

On certain occasions you may want to calculate a result based on an association,without having to load all the records for it. For example, if you wanted toload the total number of comments an article has along with all the articledata, you can use the leftJoinWith() function:

  1. $query = $articlesTable->find();
  2. $query->select(['total_comments' => $query->func()->count('Comments.id')])
  3. ->leftJoinWith('Comments')
  4. ->group(['Articles.id'])
  5. ->enableAutoFields(true);

The results for the above query will contain the article data and thetotal_comments property for each of them.

leftJoinWith() can also be used with deeply nested associations. This isuseful, for example, for bringing the count of articles tagged with a certainword, per author:

  1. $query = $authorsTable
  2. ->find()
  3. ->select(['total_articles' => $query->func()->count('Articles.id')])
  4. ->leftJoinWith('Articles.Tags', function ($q) {
  5. return $q->where(['Tags.name' => 'awesome']);
  6. })
  7. ->group(['Authors.id'])
  8. ->enableAutoFields(true);

This function will not load any columns from the specified associations into theresult set.

Changing Fetching Strategies

As mentioned in ref:contain-conditions, you can customize the strategyused by an association in a contain().

If you look at BelongsTo and HasOne ref:associations options,the default ‘join’ strategy and ‘INNER’ joinType can be changed to‘select’:

  1. $query = $articles->find()->contain([
  2. 'Comments' => [
  3. 'strategy' => 'select',
  4. ]
  5. ]);

This can be useful when you need to add conditions that don’twork well in a join. This also makes it possible to query tablesthat are not allowed in joins such as separate databases.

Usually, you set the strategy for an association when defining itin Table::initialize(), but you can permanently change the strategy manually:

  1. $articles->Comments->setStrategy('select');

Fetching With The Subquery Strategy

As your tables grow in size, fetching associations from them can becomeslower, especially if you are querying big batches at once. A good way ofoptimizing association loading for hasMany and belongsToManyassociations is by using the subquery strategy:

  1. $query = $articles->find()->contain([
  2. 'Comments' => [
  3. 'strategy' => 'subquery',
  4. 'queryBuilder' => function ($q) {
  5. return $q->where(['Comments.approved' => true]);
  6. }
  7. ]
  8. ]);

The result will remain the same as with using the default strategy, but thiscan greatly improve the query and fetching time in some databases, inparticular it will allow to fetch big chunks of data at the same time indatabases that limit the amount of bound parameters per query, such asMicrosoft SQL Server.

Lazy Loading Associations

While CakePHP makes it easy to eager load your associations, there may be caseswhere you need to lazy-load associations. You should refer to theLazy Loading Associations and Loading Additional Associationssections for more information.

Working with Result Sets

Once a query is executed with all(), you will get an instance ofCake\ORM\ResultSet. This object offers powerful ways to manipulatethe resulting data from your queries. Like Query objects, ResultSets area Collection and you can use any collectionmethod on ResultSet objects.

Result set objects will lazily load rows from the underlying prepared statement.By default results will be buffered in memory allowing you to iterate a resultset multiple times, or cache and iterate the results. If you need work witha data set that does not fit into memory you can disable buffering on the queryto stream results:

  1. $query->disableBufferedResults();

Turning buffering off has a few caveats:

  • You will not be able to iterate a result set more than once.
  • You will also not be able to iterate & cache the results.
  • Buffering cannot be disabled for queries that eager load hasMany orbelongsToMany associations, as these association types require eagerlyloading all results so that dependent queries can be generated.

Warning

Streaming results will still allocate memory for the entire results whenusing PostgreSQL and SQL Server. This is due to limitations in PDO.

Result sets allow you to cache/serialize or JSON encode results for APIresults:

  1. // In a controller or table method.
  2. $results = $query->all();
  3.  
  4. // Serialized
  5. $serialized = serialize($results);
  6.  
  7. // Json
  8. $json = json_encode($results);

Both serializing and JSON encoding result sets work as you would expect. Theserialized data can be unserialized into a working result set. Converting toJSON respects hidden & virtual field settings on all entity objectswithin a result set.

In addition to making serialization easy, result sets are a ‘Collection’ object andsupport the same methods that collection objectsdo. For example, you can extract a list of unique tags on a collection ofarticles by running:

  1. // In a controller or table method.
  2. $query = $articles->find()->contain(['Tags']);
  3.  
  4. $reducer = function ($output, $value) {
  5. if (!in_array($value, $output)) {
  6. $output[] = $value;
  7. }
  8. return $output;
  9. };
  10.  
  11. $uniqueTags = $query->all()
  12. ->extract('tags.name')
  13. ->reduce($reducer, []);

Some other examples of the collection methods being used with result sets are:

  1. // Filter the rows by a calculated property
  2. $filtered = $results->filter(function ($row) {
  3. return $row->is_recent;
  4. });
  5.  
  6. // Create an associative array from result properties
  7. $results = $articles->find()->contain(['Authors'])->all();
  8.  
  9. $authorList = $results->combine('id', 'author.name');

The Collections chapter has more detail on what can bedone with result sets using the collections features. The Adding Calculated Fieldssection show how you can add calculated fields, or replace the result set.

Getting the First & Last Record From a ResultSet

You can use the first() and last() methods to get the respective recordsfrom a result set:

  1. $result = $articles->find('all')->all();
  2.  
  3. // Get the first and/or last result.
  4. $row = $result->first();
  5. $row = $result->last();

Getting an Arbitrary Index From a ResultSet

You can use skip() and first() to get an arbitrary record froma ResultSet:

  1. $result = $articles->find('all')->all();
  2.  
  3. // Get the 5th record
  4. $row = $result->skip(4)->first();

Checking if a Query or ResultSet is Empty

You can use the isEmpty() method on a Query or ResultSet object to see if ithas any rows in it. Calling isEmpty() on a Query object will evaluate thequery:

  1. // Check a query.
  2. $query->isEmpty();
  3.  
  4. // Check results
  5. $results = $query->all();
  6. $results->isEmpty();

Loading Additional Associations

Once you’ve created a result set, you may need to loadadditional associations. This is the perfect time to lazily eager load data. Youcan load additional associations using loadInto():

  1. $articles = $this->Articles->find()->all();
  2. $withMore = $this->Articles->loadInto($articles, ['Comments', 'Users']);

You can eager load additional data into a single entity, or a collection ofentities.

Modifying Results with Map/Reduce

More often than not, find operations require post-processing the data that isfound in the database. While entities’ getter methods can take care of most ofthe virtual field generation or special data formatting, sometimes youneed to change the data structure in a more fundamental way.

For those cases, the Query object offers the mapReduce() method, whichis a way of processing results once they are fetched from the database.

A common example of changing the data structure is grouping results togetherbased on certain conditions. For this task we can use the mapReduce()function. We need two callable functions the $mapper and the $reducer.The $mapper callable receives the current result from the database as firstargument, the iteration key as second argument and finally it receives aninstance of the MapReduce routine it is running:

  1. $mapper = function ($article, $key, $mapReduce) {
  2. $status = 'published';
  3. if ($article->isDraft() || $article->isInReview()) {
  4. $status = 'unpublished';
  5. }
  6. $mapReduce->emitIntermediate($article, $status);
  7. };

In the above example $mapper is calculating the status of an article, eitherpublished or unpublished, then it calls emitIntermediate() on theMapReduce instance. This method stores the article in the list of articleslabelled as either published or unpublished.

The next step in the map-reduce process is to consolidate the final results. Foreach status created in the mapper, the $reducer function will be called soyou can do any extra processing. This function will receive the list of articlesin a particular “bucket” as the first parameter, the name of the “bucket” itneeds to process as the second parameter, and again, as in the mapper()function, the instance of the MapReduce routine as the third parameter. Inour example, we did not have to do any extra processing, so we just emit()the final results:

  1. $reducer = function ($articles, $status, $mapReduce) {
  2. $mapReduce->emit($articles, $status);
  3. };

Finally, we can put these two functions together to do the grouping:

  1. $articlesByStatus = $articles->find()
  2. ->where(['author_id' => 1])
  3. ->mapReduce($mapper, $reducer);
  4.  
  5. foreach ($articlesByStatus as $status => $articles) {
  6. echo sprintf("There are %d %s articles", count($articles), $status);
  7. }

The above will ouput the following lines:

  1. There are 4 published articles
  2. There are 5 unpublished articles

Of course, this is a simplistic example that could actually be solved in anotherway without the help of a map-reduce process. Now, let’s take a look at anotherexample in which the reducer function will be needed to do something more thanjust emitting the results.

Calculating the most commonly mentioned words, where the articles containinformation about CakePHP, as usual we need a mapper function:

  1. $mapper = function ($article, $key, $mapReduce) {
  2. if (stripos($article['body'], 'cakephp') === false) {
  3. return;
  4. }
  5.  
  6. $words = array_map('strtolower', explode(' ', $article['body']));
  7. foreach ($words as $word) {
  8. $mapReduce->emitIntermediate($article['id'], $word);
  9. }
  10. };

It first checks for whether the “cakephp” word is in the article’s body, andthen breaks the body into individual words. Each word will create its ownbucket where each article id will be stored. Now let’s reduce our results toonly extract the count:

  1. $reducer = function ($occurrences, $word, $mapReduce) {
  2. $mapReduce->emit(count($occurrences), $word);
  3. }

Finally, we put everything together:

  1. $wordCount = $articles->find()
  2. ->where(['published' => true])
  3. ->andWhere(['published_date >=' => new DateTime('2014-01-01')])
  4. ->disableHydration()
  5. ->mapReduce($mapper, $reducer)
  6. ->toArray();

This could return a very large array if we don’t clean stop words, but it couldlook something like this:

  1. [
  2. 'cakephp' => 100,
  3. 'awesome' => 39,
  4. 'impressive' => 57,
  5. 'outstanding' => 10,
  6. 'mind-blowing' => 83
  7. ]

One last example and you will be a map-reduce expert. Imagine you havea friends table and you want to find “fake friends” in our database, orbetter said, people who do not follow each other. Let’s start with ourmapper() function:

  1. $mapper = function ($rel, $key, $mr) {
  2. $mr->emitIntermediate($rel['target_user_id'], $rel['source_user_id']);
  3. $mr->emitIntermediate(-$rel['source_user_id'], $rel['target_user_id']);
  4. };

The intermediate array will be like the following:

  1. [
  2. 1 => [2, 3, 4, 5, -3, -5],
  3. 2 => [-1],
  4. 3 => [-1, 1, 6],
  5. 4 => [-1],
  6. 5 => [-1, 1],
  7. 6 => [-3],
  8. ...
  9. ]

Positive numbers mean that a user, indicated with the first-level key, isfollowing them, and negative numbers mean that the user is followed by them.

Now it’s time to reduce it. For each call to the reducer, it will receive a listof followers per user:

  1. $reducer = function ($friends, $user, $mr) {
  2. $fakeFriends = [];
  3.  
  4. foreach ($friends as $friend) {
  5. if ($friend > 0 && !in_array(-$friend, $friends)) {
  6. $fakeFriends[] = $friend;
  7. }
  8. }
  9.  
  10. if ($fakeFriends) {
  11. $mr->emit($fakeFriends, $user);
  12. }
  13. };

And we supply our functions to a query:

  1. $fakeFriends = $friends->find()
  2. ->disableHydration()
  3. ->mapReduce($mapper, $reducer)
  4. ->toArray();

This would return an array similar to this:

  1. [
  2. 1 => [2, 4],
  3. 3 => [6]
  4. ...
  5. ]

The resulting array means, for example, that user with id 1 follows users2 and 4, but those do not follow 1 back.

Stacking Multiple Operations

Using mapReduce in a query will not execute it immediately. The operation willbe registered to be run as soon as the first result is attempted to be fetched.This allows you to keep chaining additional methods and filters to the queryeven after adding a map-reduce routine:

  1. $query = $articles->find()
  2. ->where(['published' => true])
  3. ->mapReduce($mapper, $reducer);
  4.  
  5. // At a later point in your app:
  6. $query->where(['created >=' => new DateTime('1 day ago')]);

This is particularly useful for building custom finder methods as described in theCustom Finder Methods section:

  1. public function findPublished(Query $query, array $options)
  2. {
  3. return $query->where(['published' => true]);
  4. }
  5.  
  6. public function findRecent(Query $query, array $options)
  7. {
  8. return $query->where(['created >=' => new DateTime('1 day ago')]);
  9. }
  10.  
  11. public function findCommonWords(Query $query, array $options)
  12. {
  13. // Same as in the common words example in the previous section
  14. $mapper = ...;
  15. $reducer = ...;
  16. return $query->mapReduce($mapper, $reducer);
  17. }
  18.  
  19. $commonWords = $articles
  20. ->find('commonWords')
  21. ->find('published')
  22. ->find('recent');

Moreover, it is also possible to stack more than one mapReduce operation fora single query. For example, if we wanted to have the most commonly used wordsfor articles, but then filter it to only return words that were mentioned morethan 20 times across all articles:

  1. $mapper = function ($count, $word, $mr) {
  2. if ($count > 20) {
  3. $mr->emit($count, $word);
  4. }
  5. };
  6.  
  7. $articles->find('commonWords')->mapReduce($mapper);

Removing All Stacked Map-reduce Operations

Under some circumstances you may want to modify a Query object so that nomapReduce operations are executed at all. This can be done bycalling the method with both parameters as null and the third parameter(overwrite) as true:

  1. $query->mapReduce(null, null, true);