Stephan Bruijnis .dev

The developers guide to performance

Jan 31, 2016
Performance, Optimization, Database, Microflows, Improvements

End-users expect fast and engaging web experiences. Yet the application landscape is becoming more complex. Your Mendix application can meet these rising expectations and impress users on both desktop and mobile devices. Learn how to locate and analyse performance issues, improve application performance and gain insight into how design decisions impact your application. This post is a comprehensive reference guide for developers to optimize their applications.

Introduction #

Performance is vital to creating a successful, fit-for-use application. You should always test with specific cases and production-like data, to learn how users use your application and where performance issues might arise. However, performance is very context-sensitive: this guide is not a set of rules carved in stone, it is more what you’d call “guidelines” than actual rules.

This reference guide is written in relatively technical, Mendix-specific language and provides a broad overview of performance improvements, from user experience to database retrieves and data processing in microflows. It is written for developers with basic knowledge of the Mendix architecture and experience with several Mendix projects.

Mendix takes care of a lot of performance-enhancing elements. For example, it adds indexes on relations between entities (junction tables). Furthermore, since Mendix 5 you don’t have to nest associated attributes in a data view. In Mendix 4 (or earlier), if a form had a data view with more than one referenced attribute, it improved performance to place those referenced attributes in an associated data view. This way the association was only used once, instead of once for each attribute. Since Mendix 5 this optimization is done by Mendix.

1 Locate and analyse performance #

Most performance issues are observed by the end user. From the end user's perspective the application takes too long to load, nothing seems to happen, pages open slowly, progress bars run for a long time, etc. Therefore the first step is to locate and analyse the performance issues.

1.1 Reproduce #

The key to analysing a performance issue is reproducing it. Without reproduction it is hard to pinpoint the cause (a specific UI element, microflow or data set) and to test possible solutions.

1.2 Approach #

Determine the focus: which element is most likely to be the cause, the UI or the microflow? And where does the load occur: processing (CPU), memory (JVM) or the database?

1.3 Tools #

Tracing issues and performance can be done with tools such as:

  • Monitoring (Trends, Logs and Cache via Sprintr and compare those with running actions and scheduled events)
  • Microflow debugging and time stamps
  • Server log nodes (especially the connection bus and web services node can be helpful)
  • Browser developer tools (such as Firebug and developer mode in Chrome; record Network allows you to see XAS requests, their payload, and also a timeline with loading times)

2 Improve performance #

2.1 Forms/Pages #

2.1.1 Use simple and user specific home pages #

Make home pages lightweight and based on the user role. The first page of the application is loaded very often, so a lightweight page can make a huge difference. Other frequently visited pages should also be optimized. Furthermore, users should be able to quickly navigate to the actions associated with their role. Every page a user has to pass through to reach the intended action or page puts a wasted load on CPU, memory and database. Focus the UX on effective navigation paths.

2.1.2 Avoid complex pages #

A complex page is a page which generates many requests to the Mendix Business Server. This is often caused by many reference selectors on a page: each reference selector triggers a retrieveList on the database. If possible, use lookup pages or wizards. Another example of a complex page is a data view which contains tab controls per topic (address, payment information) and data views and data grids for related objects (order history, wish list). Reconsider the structure of these pages: which data is required, and which data is optional and can be moved to additional screens (lookup pages, popups, wizards etc.)? A good rule of thumb is to reduce the number of components used on a page.

2.1.3 Reduce associated data, consider de-normalization #

Referenced data should be avoided where possible, because it requires extra queries to retrieve, whether it is shown in a data grid, used as a filter or used as a sorting attribute. Consider denormalizing the domain model when confronted with associated attributes. The ‘After Commit’ event or sub-microflows are ways to keep the data in sync (more…, mx).

2.1.4 Avoid complex conditional visibility #

Complex conditional visibility is more complicated for the browser to process and makes the page take longer to render. Conditional visibility still loads all the data (even if it’s not shown). Furthermore, data on tab pages is initially not loaded until the tab page is accessed; however, when a tab page contains elements with conditional formatting, this data will also be part of the initial load. Also, conditional visibility != security, but that’s another story. Unconfirmed: nested tables can also cause performance issues.

2.1.5 Test custom widgets #

Custom widgets can easily add a lot of requests to the Mendix Business Server and custom widgets are not always optimized (caching issues). Test custom widgets thoroughly before using them.

2.1.6 Compress content #

Mendix bundles widgets, and from 5.16 on it bundles all the things: all custom widgets are bundled, so only one request is made (to the MBS) for custom widgets. Furthermore, the caching configuration has been vastly improved. Yet the developer still needs to consider the resources loaded by the client. First and most importantly, resize and compress images. Determine what image quality is required per context and per device. Also make sure you have well-structured CSS and prevent redundant CSS styles. If you want to take it to the next level, consider minifying and compressing CSS and custom JavaScript.

2.1.7 Know when to use Microflow Data Sources #

Sometimes you just need data source microflows, because the target objects need to adhere to very specific criteria, or objects need to be shown under conditions that cannot be handled by XPath. However, there are a few drawbacks to microflow data sources: (1) there is no pagination, so all objects are retrieved (compared to a data grid with the database as data source, which only retrieves the number of rows shown), (2) all ids of the relations are retrieved (of which the retrieved object is the owner), (3) the microflow runs after the widget is loaded or refreshed on the page. A microflow as data source for reference selectors, however, could be a good choice! Consider a reference selector constrained by an XPath with an OR operator. The performance can be improved with a microflow that determines the selectable objects. The microflow has two individual retrieves whose results are added to one list; this outperforms one retrieve with a more complex XPath containing an OR constraint, but more on this further on.

2.1.8 Search properties #

Add “wait for search” or set a “default search” on data grids. This causes the page to either wait for specific input before triggering a retrieve or limit the number of objects retrieved by the page, thus reducing the load on the database.

2.1.9 Use snippets #

Snippets are reusable interface parts. They improve maintainability; when changes need to be made it has to be done in fewer places. Furthermore it makes pages load faster!

2.1.10 Use schema on a data view or don’t (deprecated as of 7.2) #

Schemas define whether only the required attributes and associations for the object(s) are retrieved. This can sometimes improve performance, but it can also reduce performance because the objects cannot be cached entirely. If you have custom widgets in your page that need access to other attributes or associations, or if your next page contains other attributes or associations of the same object(s), you should not enable this. This is why the default value is false (mx). Unconfirmed: “use schema” uses the schema from the previous page.

2.1.11 Remove unused widgets #

Upon opening the app in the client, all widgets are loaded from the server to the client in one file: widget.js. This file contains all the widgets present in the widget folder of your project. As described before, all widgets are bundled into one file. Any unused widgets will unnecessarily increase the size of this file and thus the loading time of your app.

2.2 Microflows #

2.2.1 Minimize retrieves #

A common pattern is to retrieve an object and then check whether it is empty. A faster and more efficient way is to simply check whether the association exists (via $object/association_name!=empty) and then act accordingly.

2.2.2 Avoid nested loops with database actions #

Nested loops used to match specific attributes can often be replaced by a combination of list operations such as intersect, find, filter and others to achieve the same matching (more). Secondly, nested loops with database actions multiply the number of actions performed: an action inside the inner loop is executed once for every combination of outer and inner object.
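As an illustration outside of Mendix, the sketch below shows the same idea in plain Python: a nested loop compares every order with every customer, while a lookup table built in one pass gives the same matches without an inner loop. The entity and attribute names (`orders`, `customers`, `customer_no`, `number`) are hypothetical and only serve the example; in a microflow the equivalent would be a list operation such as find or filter.

```python
from collections import defaultdict


def match_nested(orders, customers):
    """Nested loops: every order is compared with every customer."""
    pairs = []
    for order in orders:
        for customer in customers:
            if order["customer_no"] == customer["number"]:
                pairs.append((order["id"], customer["name"]))
    return pairs


def match_indexed(orders, customers):
    """One pass per list: build a lookup table, then look each order up."""
    by_number = defaultdict(list)
    for customer in customers:
        by_number[customer["number"]].append(customer["name"])
    return [
        (order["id"], name)
        for order in orders
        for name in by_number.get(order["customer_no"], [])
    ]
```

With 1,000 orders and 1,000 customers the nested version performs 1,000,000 comparisons, while the lookup version touches each object once.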

2.2.3 Separate retrieves for multiple aggregate actions #

When a database retrieve activity is only used in combination with one list aggregate activity, the platform can automatically merge these two activities into a single action, which executes a single aggregate query on the database. However, if you re-use the same list variable for multiple list aggregates this no longer applies: the platform only merges the activities as long as the list is used for one single aggregate. As soon as you start reusing the list variable anywhere else, you could end up with memory issues. Just create two separate retrieve activities instead; you’ll prevent memory errors and improve the processing speed drastically (mx).
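The difference between aggregating in the database and aggregating over a list held in memory can be sketched with SQLite from the Python standard library. This is an analogy only, not what the Mendix runtime executes; the table `order_line` and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_line (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO order_line (amount) VALUES (?)",
    [(float(i),) for i in range(1, 1001)],
)


def aggregates_in_memory(conn):
    """Retrieve the whole list once, then aggregate it twice in memory."""
    rows = conn.execute("SELECT id, amount FROM order_line").fetchall()
    return len(rows), sum(amount for _, amount in rows)


def aggregates_in_database(conn):
    """Two separate aggregate queries: no rows are loaded into memory."""
    count = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
    total = conn.execute("SELECT SUM(amount) FROM order_line").fetchone()[0]
    return count, total
```

Both functions return the same numbers, but the second one only ever transfers two values from the database, no matter how large the table grows.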

2.2.4 Don’t refresh and commit every action #

Make it very specific when the client needs a refresh and when something needs to be committed in the database. See also commit lists.

2.2.5 Commit lists #

The commit list functionality commits multiple objects in a batch to the database; this process is far more efficient than commits executed in a loop. Even if the objects are newly created, consider adding them to a list and committing them as a batch. This greatly reduces the load on the database.
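The same principle applies to any database client. The sketch below uses SQLite from the Python standard library as a stand-in (an assumption for illustration; Mendix handles this for you when you commit a list) and contrasts one transaction per object with one transaction for the whole batch. The `customer` table and the helper names are hypothetical.

```python
import sqlite3


def new_database():
    """Create an empty in-memory database with a hypothetical customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def commit_in_loop(conn, names):
    """Comparable to a commit activity inside a loop: one transaction per object."""
    for name in names:
        conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        conn.commit()


def commit_as_list(conn, names):
    """Comparable to committing a list: all objects in a single transaction."""
    conn.executemany(
        "INSERT INTO customer (name) VALUES (?)", [(name,) for name in names]
    )
    conn.commit()


def count_customers(conn):
    return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Both variants store the same data; the batch variant simply pays the transaction overhead once instead of once per object.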

2.2.6 Clear lists #

The change list action “clear” is often ignored by developers, unfairly. The clear list action removes the objects from the targeted list; consequently the objects in the list are unavailable further on in the microflow. The load on memory can benefit greatly from the clear list action, especially in microflows that consume a lot of memory. However, clearing a list also takes up resources on the server.

To understand the dynamics of memory consumption we need to look at two memory management systems on top of each other: “the Mendix Platform internal memory, which stores all non-persistent objects and all changed objects. And the regular Java Garbage collector which cleans up all objects as soon as the Platform is done with them. The built in Mendix garbage collector will not execute until the end of the microflow. That means that when you change an object, those changes will sit somewhere in memory as well. For the Mendix garbage collector it makes no difference since the Mx Garbage Collector will not be executed until the end of the microflow. There is also the Java Garbage Collector. This process runs whenever needed and cleans up all objects that are no longer referenced from memory. If you remove items from the list it will allow the JGC to collect those objects and free memory” (mx).

Make sure that you commit the changes first, because that ensures that these objects are added to the database transaction and kept in database memory until they are really stored in the database (when the end of the transaction/microflow is reached).

2.2.7 Execute microflows asynchronous when possible #

Flows handling large amounts of data are best processed by flows triggered by the system (either scheduled events or asynchronously in a background queue). Some large flows do need to be linked to user actions; it’s best to link these to a specific user invocation (e.g. a button). This emphasizes the action, and a progress bar visualizes the longer-running flow. Furthermore, asynchronous flows prevent the client from resending the request too quickly to the server (and triggering the flow again if concurrent execution is not disabled) or even showing a connection dropped error. Instead the server is polled every 10 seconds to check whether the microflow execution is completed. Unconfirmed: it is also better to set on change microflows to asynchronous.

2.3 Database #

2.3.1 Use the limit and offset #

Retrieve a given number of objects (limit) starting at a given index (offset). By limiting the number of objects retrieved, the load on the database is reduced. With a custom loop (exclusive split and merge) the entire set can still be processed in batches. When you limit the number of objects per batch and create a loop over multiple batches, make sure that you place the create list action (for the objects to commit) inside the correct loop and not before the loop. Because the list is then initialized in each iteration, this prevents it from growing with each iteration (mx).
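The batching pattern described above can be sketched as follows, again using SQLite from the Python standard library as a stand-in for the database. The `invoice` table, the batch size of 100 and the function names are assumptions made for the example. Note where the `to_commit` list is created: inside the batch loop, so it starts empty for every batch.

```python
import sqlite3


def new_invoice_database(count):
    """Create an in-memory database with `count` unprocessed invoices."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice (id INTEGER PRIMARY KEY, processed INTEGER DEFAULT 0)"
    )
    conn.executemany("INSERT INTO invoice (processed) VALUES (?)", [(0,)] * count)
    conn.commit()
    return conn


def process_in_batches(conn, batch_size=100):
    """Process all invoices in batches; returns (total processed, largest batch)."""
    offset = 0
    processed = 0
    largest_batch = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM invoice ORDER BY id LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        to_commit = []  # created inside the loop: never holds more than one batch
        for (invoice_id,) in rows:
            to_commit.append((invoice_id,))
        conn.executemany("UPDATE invoice SET processed = 1 WHERE id = ?", to_commit)
        conn.commit()
        processed += len(to_commit)
        largest_batch = max(largest_batch, len(to_commit))
        offset += batch_size
    return processed, largest_batch
```

If `to_commit` were created before the `while` loop, it would keep every object of every earlier batch, and each commit would write a larger and larger list.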

2.3.2 Indexes: tread lightly #

Indexes improve the speed of retrieving objects, but slow down the process of changing and deleting objects. If the indexed attributes are used in a search field, XPath constraint, grid or in a WHERE clause of an OQL query, an index can improve performance. However, search fields of which the Comparison property has the value ‘Contains’ do not take advantage of the improved performance. Changing and deleting objects of an entity with indexes takes longer, because the index needs to be updated in addition to the actual data. Therefore, for attributes which are rarely used as criteria in a search or query, only create an index if the increase in retrieval performance justifies the decrease in update performance (mx). Indexes are most useful when read actions (search, sort, order, retrieve) outnumber write actions (create, change, delete). A few other points when using indexes:

  • Only one index per queried table can be used
  • An index is used when the first column of an index is used in the query
  • The most specific index will be used
  • Only on attributes with high data variance (thus, booleans are less suited)
  • Effective on larger tables
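The point about the first column of an index can be checked with a small experiment. The sketch below uses SQLite from the Python standard library, which is an assumption for illustration only: a Mendix app typically runs on another database, and every database engine has its own query planner, so the details will differ. The `customer` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT, name TEXT, email TEXT)"
)
# Composite index: 'city' is the first column, 'name' the second.
conn.execute("CREATE INDEX idx_customer_city_name ON customer (city, name)")
conn.executemany(
    "INSERT INTO customer (city, name, email) VALUES (?, ?, ?)",
    [
        ("Rotterdam", "Jansen", "jansen@example.com"),
        ("Utrecht", "De Vries", "devries@example.com"),
    ],
)


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Constraint on the first index column: the index can be used.
plan_city = query_plan(conn, "SELECT * FROM customer WHERE city = ?", ("Rotterdam",))
# Constraint on the second index column only: the index is not used.
plan_name = query_plan(conn, "SELECT * FROM customer WHERE name = ?", ("Jansen",))
```

Printing `plan_city` and `plan_name` shows that only the first query mentions the index; the second falls back to scanning the table.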

2.3.3 Inheritance versus 1-1 #

Both inheritance and 1-1 associations have their advantages and disadvantages. Based on your needs you need to decide per situation what is best for that entity. Never use inheritance for entities with:

  • A high number of transactions on the different sub entities (we consider multiple changes or creates per second to be high)
  • Only a handful of common attributes. If you feel that it isn’t worth creating associated objects for the information, it isn’t worth inheriting either

Never use 1-1 associations for entities:

  • That always require the information from the associated objects, and where users intensively search and sort on the associated attributes (mx)

Note: There is no guarantee in the order in which event handlers are triggered when using inheritance.

2.3.4 Avoid large XPaths #

XPath constraints allow you to constrain data based on the association with other tables. In other words, think of an XPath constraint as a join between two tables. In order to write XPath constraints effectively, you should typically take the shortest possible path to the desired data to eliminate unnecessary joins (mx). Consider adding a direct association between entities to replace an indirect association over multiple entities, and keep it in sync with e.g. after commit events. The “3 ways to improve performance” blog suggests rewriting XPaths; however, this also changes the semantics of the XPath (see earlier posts).

2.3.5 Avoid a large combination of rules #

Specific user rules defined in the access rules tab of an entity have the tendency to become large XPath constraints. This can be the cause for performance issues with some user (roles) while others experience no difficulties. Rethink the app user roles and combine user roles if possible.

2.3.6 relation/entity/id = $object versus relation = $object #

Retrieves of an associated object are often made based on the id or $object (e.g. [relation_name/entity/id = $object] or [relation_name/entity/id = 'ID_124123512341']). This notation is not recommended, because its execution is inefficient and results in lower performance due to the manner in which it is processed by the database (mx). Rather use [relation_name = $object] or [relation_name = 'ID_124123512341'].

2.3.7 Archive data #

Large tables slow down queries, so archiving, deleting or otherwise removing unused data and associations will improve performance.

2.3.8 Event handlers #

Event handlers are always triggered (unless defined otherwise), and will slow down batch processes (commits and deletes). Also see Inheritance versus 1-1.

2.3.9 Delete behaviour #

This one is tricky: a complex structure of delete behaviour can definitely slow down a batch delete process. On the other hand, it does enforce security (access rules) and data quality. You could consider using microflows to replace the delete behaviour.

2.3.10 Pre-calculated attributes #

Calculated or virtual attributes are nearly always executed, creating an overhead cost and thus a load on processing power. The solution: de-normalize calculated attributes. Use pre-calculated attributes instead; on each change of a value the new value can be calculated and stored. This shifts the processing load from viewing values to changing values. When calculated attributes are absolutely necessary, make sure that the executed microflow is lightweight, i.e. contains no retrieves of other objects. Or put them on a 1-1 associated entity (so they are only retrieved when needed).
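The shift from calculating on read to calculating on change can be sketched in a few lines of plain Python. The `Order` class and its members are hypothetical; in Mendix the stored value would be a regular attribute that is updated in, for example, a microflow or event handler whenever the underlying data changes.

```python
class Order:
    """Hypothetical order with a stored (pre-calculated) total."""

    def __init__(self):
        self._lines = []
        self.total = 0.0  # stored attribute, cheap to read

    def add_line(self, amount):
        """The total is recalculated when the data changes..."""
        self._lines.append(amount)
        self.total += amount

    def calculated_total(self):
        """...instead of being recalculated every time it is read."""
        return sum(self._lines)
```

Reading `order.total` costs nothing extra, however often the order is shown; `calculated_total()` does the work again on every call.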

2.3.11 Consider non-persistent entities #

Non-persistent objects only exist in memory, so the database is not accessed. Non-persistent entities shift the load from the database to memory. Non-persistent objects are cleared from memory by the garbage collector when they are not used any more.

2.3.12 Optimize the XPath constraints #

XPath constraints allow you to constrain data based on the association with other tables. In other words, think of an XPath constraint as a join between two tables. In order to write XPath constraints effectively, you should typically take the shortest possible path to the desired data to eliminate unnecessary joins (mx). Optimize the XPath constraints by following these pointers:

  • Attributes above associations: joins (associations) require more resources than constraints on the retrieved object
  • Most discriminating constraint first: by putting the most restrictive constraints first, the amount of data retrieved is limited

The XPath constraints are executed in the order of notation. It matters which XPath constraint comes first! Yet these pointers, just like this guide as a whole, are very dependent on the context. Constraints with associations can be very discriminating and thus more effective, and some constraints benefit from indexes, making them less costly to execute. Avoid these specific XPath statements when possible:

  • OR: can often be replaced by two separate retrieves followed by a union of the results (more efficient)
  • not(): can cause performance issues on large tables. not() retrieves data and inverts the result (it retrieves all data in a sub query)
  • contains(), starts-with(), ends-with(): the performance depends on the SQL engine, the level of index usage by the query and the execution plan of the database. Indexes can be used on various levels (more):
    • Clustered index seek: very fast and efficient. All data is physically ordered according to the specified column(s) and the SQL server can pull the data sequentially
    • Index seek: very fast and efficient. The SQL server knows quite well where the data is and can go directly to it and seek out the needed rows. The data isn’t ordered in the database by the fields in the index, so it’s likely not pulling the data sequentially like in a clustered index seek
    • Index scan: can be slow and costly. It scans through the index instead of scanning the physical table(s). It doesn’t know exactly where the data is, so it may scan the whole index or a partial range of the index to find its data
    • Clustered index scan: slow and costly. The index is not much more than a table with the data physically ordered by the specified columns. SQL has to physically search through every single row in the clustered index, just like a table scan
    • Table scan: slow and inefficient. SQL has to physically look at every single row in the table

In short, starts-with() translates into LIKE 'value%', ends-with() into LIKE '%value' and contains() into LIKE '%value%'. Usage of the LIKE statement (or ILIKE on a PostgreSQL database) doesn’t mean that no indexes are used; the query can still use the index on a certain level. LIKE 'value%' can still use an index seek, while LIKE '%value' and LIKE '%value%' can still use an index scan.
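The suggestion to replace an OR by two separate retrieves and a union can be sketched as follows. SQLite from the Python standard library stands in for the database, and the `task` table with its `owner` and `reviewer` columns is made up for the example. The sketch only shows that both approaches return the same objects, and that the union has to remove duplicates; whether it is actually faster depends on the database and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, owner TEXT, reviewer TEXT)")
conn.executemany(
    "INSERT INTO task (owner, reviewer) VALUES (?, ?)",
    [("anna", "bob"), ("bob", "anna"), ("carl", "dina"), ("anna", "anna")],
)


def retrieve_with_or(conn, user):
    """One retrieve with an OR constraint."""
    rows = conn.execute(
        "SELECT id FROM task WHERE owner = ? OR reviewer = ? ORDER BY id",
        (user, user),
    )
    return [row[0] for row in rows]


def retrieve_and_union(conn, user):
    """Two simple retrieves, combined into one list without duplicates."""
    owned = conn.execute("SELECT id FROM task WHERE owner = ?", (user,)).fetchall()
    reviewed = conn.execute(
        "SELECT id FROM task WHERE reviewer = ?", (user,)
    ).fetchall()
    return sorted({row[0] for row in owned} | {row[0] for row in reviewed})
```

Task 4 is both owned and reviewed by the same user, so it appears in both retrieves; the set union makes sure it ends up in the result only once.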

2.3.13 Limit the number of user roles per user #

For each user role (project security) assigned to a user in the application, a CASE expression is added to the database query. Each CASE is an expression that returns a boolean result. Each additional user role (per user) therefore implies longer queries for every database action for that user.

2.4 XML and Web Services #

2.4.1 XML options which influence performance #

The Import XML and Export XML actions have the following settings and when set to true (default false) they could slow down performance:

  • Validate against schema: Whether the import action should validate the incoming XML against the schema (XSD).
  • Use subtransactions for microflows: Specifies whether separate (nested) database transactions should be used when obtaining objects via microflow.

2.4.2 Asynchronous processing #

Data received via a web service (either published or consumed) can be processed asynchronously to improve performance. Store the data in an entity without any transformation, and trigger the processing of the data via a scheduled event or asynchronously via a queue (see also Execute microflows asynchronous when possible).
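A minimal sketch of this store-first, process-later pattern in plain Python is shown below. The names `inbox`, `receive` and `process_inbox` are hypothetical: in a Mendix app `inbox` would be an entity holding the raw message, and `process_inbox` would be a microflow triggered by a scheduled event or a queue.

```python
import json
from collections import deque

inbox = deque()  # raw messages, stored exactly as received
processed = []   # results of the (slower) transformation step


def receive(payload):
    """Called for every incoming message: store it and return immediately."""
    inbox.append(payload)
    return "accepted"


def process_inbox():
    """Runs later, outside the request: parse and transform stored messages."""
    while inbox:
        raw = inbox.popleft()
        processed.append(json.loads(raw))
```

The caller of the web service gets its response as soon as the raw data is stored; parsing errors and slow transformations no longer delay or break the request itself.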

2.5 Hardware #

2.5.1 Users #

The hardware used by the end user plays a role in a fast and engaging web experience. In some cases the hardware, or just the browser, of the user needs an update.

2.5.2 App engines or hardware #

New functionality, users, entities, services, etc. are often added to the app without considering the additional resources required to process these additions. Applications hosted in the Mendix cloud can be upgraded with additional ‘App Engines’. When hosting on premise, add additional hardware or extra virtual resources.

3 Final remarks #

This reference guide is a combination of Mendix documentation, other blogs, personal experiences and things learned in the Mendix performance workshop (I recommend this workshop to anyone interested in learning more about performance). Did I miss something? Let me know! If you have a pointer to improve performance, a citation or reference, etc., leave a comment or email me.

3.0.1 Not covered #

This reference guide for developers does not cover:

  • Performance issues caused by the platform, e.g. memory leaks. The platform is thoroughly tested by Mendix and issues found are fixed in following releases.