- Pestle 1.1.1 Released
- Pestle 1.1.2 Released
- Magento 2 Setup Migration Scripts
- Pestle 1.2.1 Released
- Sending Text Messages with PHP, pestle, and Nexmo
- Pestle 1.3 and AbstractModel UI Generation
- Pestle 1.4.1 and the Merits of Inheritance
- Pestle 1.4.4 Released
- Pestle Docs Done (for now)
- Pestle 1.4.2 Now Available
- Installing Pestle via. Homebrew
- Pestle 1.5.2 Released
Today we’re going to talk about pestle’s new magento2:generate:schema-upgrade command. To do that, we need to talk a little bit about Magento 1, as well as “migrations” in information systems more generally.
Information Systems
Most e-commerce, content management, and CRM software can be broadly classified as information systems.
An information system (IS) is any organized system for the collection, organization, storage and communication of information. More specifically, it is the study of complementary networks that people and organizations use to collect, filter, process, create and distribute data.
All of these systems have some form of data persistence. Most of them primarily use an ANSI-SQL database to store structured data (MySQL, PostgreSQL, etc.). Newer systems might also incorporate a key/value or document store (Redis, MongoDB).
Regardless of how they store their data, these information systems are software. This means their creators and maintainers will inevitably need to update them. All successful software needs to fix bugs and improve features if it wants to remain successful.
When it comes to data persistence, this presents a few challenges:
- How do we make updates to our users’ database schemas?
- How do we ensure the default data in the data-persistence layer is present?
- How do we do this in a way that creates a record of how/when these changes happened?
- How do we tie these changes to a particular version of our software?
- How do we do this in a way that either succeeds, or clearly falls back to the previous schema/data version?
Broadly speaking, in the realm of “MVC software systems used to build information systems software”, the world has settled on using schema/data migrations to achieve this. Migrations
- Contain instructions for upgrading, and sometimes downgrading, a database schema
- Contain instructions for adding, removing, or changing the default data needed to make a system run (i.e. NOT the data the system is managing)
- Are versioned
- Can be run in such a way that the default schema/data at any version of an information system’s life can be recreated
While the concept surely predates this, the first time I encountered schema and data migrations in my career was via Ruby on Rails. The first PHP system I worked with that had a formalized migration system was Magento 1.
Magento 1 Setup Resource Scripts
Magento 1’s migration system was usually referred to as the Setup Resource system — named for the base PHP class that controlled the system.
When properly configured, each Magento module could have a data and sql folder. Each folder could contain scripts named via Magento module version numbers
$ ls -1 app/code/core/Mage/Catalog/data/catalog_setup/
data-install-1.6.0.0.php
data-upgrade-1.6.0.0.12-1.6.0.0.13.php
data-upgrade-1.6.0.0.13-1.6.0.0.14.php
data-upgrade-1.6.0.0.4-1.6.0.0.5.php
data-upgrade-1.6.0.0.8-1.6.0.0.9.php
$ ls -1 app/code/core/Mage/Catalog/sql/catalog_setup/
install-1.6.0.0.php
mysql4-data-upgrade-0.7.57-0.7.58.php
mysql4-data-upgrade-0.7.63-0.7.64.php
mysql4-data-upgrade-1.4.0.0.28-1.4.0.0.29.php
/* ... */
upgrade-1.6.0.0.7-1.6.0.0.8.php
upgrade-1.6.0.0.9-1.6.0.0.10.php
The intricacies of these naming conventions are beyond the scope of this article, but encoded in the file names are instructions for when (based on module version) these scripts should run. Inside each script are instructions for updating the database.
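To give a rough feel for how a version range can be read out of a file name, here is a minimal sketch. It is not Magento 1's actual parsing code: the parseUpgradeFileName and appliesToUpgrade functions are hypothetical names, and the matching rule is a simplification that only handles the upgrade-style names shown above.

```php
<?php
// Hypothetical helper, NOT Magento 1's real parser: reads the version
// range out of an upgrade script's file name.
function parseUpgradeFileName(string $fileName): ?array
{
    // Matches names like "upgrade-1.6.0.0.7-1.6.0.0.8.php",
    // "data-upgrade-..." and "mysql4-data-upgrade-...".
    $pattern = '/^(?:mysql4-)?(data-)?upgrade-([\d.]+)-([\d.]+)\.php$/';
    if (!preg_match($pattern, $fileName, $m)) {
        return null; // install scripts and unrelated files are ignored here
    }
    return [
        'type' => $m[1] === '' ? 'schema' : 'data',
        'from' => $m[2],
        'to'   => $m[3],
    ];
}

// Simplified rule (an assumption): a script applies when its whole range
// sits inside the jump from the installed version to the target version.
function appliesToUpgrade(array $script, string $installed, string $target): bool
{
    return version_compare($script['from'], $installed, '>=')
        && version_compare($script['to'], $target, '<=');
}

$script = parseUpgradeFileName('upgrade-1.6.0.0.7-1.6.0.0.8.php');
var_dump(appliesToUpgrade($script, '1.6.0.0.5', '1.6.0.0.10')); // bool(true)
var_dump(appliesToUpgrade($script, '1.6.0.0.8', '1.6.0.0.10')); // bool(false)
```

The point is only that the file name carries enough information for the system, rather than the module developer, to decide what runs.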
#File: app/code/core/Mage/Catalog/data/catalog_setup/data-upgrade-1.6.0.0.13-1.6.0.0.14.php
/*...*/
$installer->startSetup();
$entityTypeId = $installer->getEntityTypeId(Mage_Catalog_Model_Category::ENTITY);
$attributeId = $installer->getAttributeId($entityTypeId, 'filter_price_range');
$attributeTableOld = $installer->getAttributeTable($entityTypeId, $attributeId);
$installer->updateAttribute($entityTypeId, $attributeId, 'backend_type', 'decimal');
/*...*/
Via this mechanism, Magento was able to alter the SQL database such that it matched what the module developers needed it to be.
Problems with MySQL, Problems with Magento
Magento 1’s migration system, while not perfect, did an OK job over the years. Both the core team, and Magento’s ecosystem of module developers were able to use it to distribute schema and data changes to their users. However, there were a few places it fell down. Sometimes this was due to design issues. Other times this was due to the limitations of Magento’s primary database — MySQL.
While PHP, and Magento 1, are capable of talking to most relational database management systems, most information systems built in PHP are biased towards MySQL. With regards to migrations, MySQL throws a sticky wicket when presented with the question:
How do we do this in a way that either succeeds, or clearly falls back to the previous schema/data version?
A good migration system will wrap its migration code in a single database transaction. If a problem occurs while the script is running, the transaction rolls the database back to the previous version.
Unfortunately, not every MySQL statement has transactional support. In particular
these include data definition language (DDL) statements, such as those that create or drop databases, those that create, drop, or alter tables or stored routines.
In other words, one of the important things a migration system needs to do is not possible with MySQL databases. If you wanted transaction support you’d need to implement it yourself in PHP code — a non-trivial task at best.
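As a rough illustration of what “implementing it yourself” could look like, here is a minimal sketch that pairs each migration step with a hand-written undo step. Everything in it is hypothetical (the runWithManualRollback function and the do/undo array shape are made up for this example), and it glosses over the hard part: some changes, like a dropped table, can't be undone without a backup of the data.

```php
<?php
// Hypothetical sketch: each step pairs a "do" closure with an "undo" closure.
// Nothing here is a Magento or MySQL API.
function runWithManualRollback(array $steps): bool
{
    $completed = [];
    try {
        foreach ($steps as $step) {
            $step['do']();
            $completed[] = $step;
        }
        return true;
    } catch (Throwable $e) {
        // Undo only the steps that finished, newest first.
        foreach (array_reverse($completed) as $step) {
            $step['undo']();
        }
        return false;
    }
}

// Stand-ins for real DDL statements; the log records what "ran".
$log = [];
$ok = runWithManualRollback([
    [
        'do'   => function () use (&$log) { $log[] = 'add column'; },
        'undo' => function () use (&$log) { $log[] = 'drop column'; },
    ],
    [
        'do'   => function () { throw new RuntimeException('index failed'); },
        'undo' => function () {},
    ],
]);

var_dump($ok);  // bool(false)
print_r($log);  // "add column" followed by "drop column"
```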
Magento 1’s Setup Resource scripts do not have transaction support. The why of this is probably lost to history, but it’s safe to say that MySQL’s lack of support for DDL transactions (and the relative inexperience of the still-pretty-smart original Magento 1 core engineers) played a role.
While not great, this is a problem anyone (Laravel, Rails, etc.) targeting MySQL needs to live with. What made things extra frustrating for Magento 1 developers was the fact that
- Setup Resource scripts ran automatically in the background of an HTTP page load
- Magento Connect, the official package manager for Magento 1, was incredibly buggy and routinely left systems half updated (while reporting no errors)
- There was no way to manually trigger the setup resource scripts
While more experienced developers would know to use something like the n98-magerun sys:setup:incremental command to run their scripts incrementally, the average Magento user was left at the mercy of an opaque, buggy system. The problems of half-complete updates got so bad that Magento needed to introduce a database repair tool that tries to “patch up” databases after a bad migration run.
All this has left most experienced Magento developers curious as to how Magento 2 would tackle these problems.
Magento 2 Setup Install/Upgrade Classes
Whether you call it a replacement, or a refactoring so thorough the system is unrecognizable, the old Magento 1 setup resource system is no longer present in Magento 2. In its place, Magento gives each module a set of Install and Upgrade classes. When a user runs the php bin/magento setup:upgrade command, Magento will look for these specifically named classes in each module. If this is the first time Magento sees a module, Magento will instantiate objects from the Packagename\Modulename\Setup\InstallSchema and Packagename\Modulename\Setup\InstallData classes, and call their install methods.
If Magento notices the module’s version number has changed, Magento will instantiate objects from the Packagename\Modulename\Setup\UpgradeSchema and Packagename\Modulename\Setup\UpgradeData classes and call their upgrade methods.
The first thing you’ll notice is Magento 2, similar to Magento 1, still separates out module installation from module upgrades. However, Magento 2 does not have any built-in support for per-version upgrade scripts. Regardless of which version of the module you’re upgrading to, Magento will always instantiate objects from the Packagename\Modulename\Setup\UpgradeSchema and Packagename\Modulename\Setup\UpgradeData classes.
The result? Each module programmer needs to do version sniffing themselves. You can see this in the Magento core in various places.
#File: vendor/magento/module-quote/Setup/UpgradeSchema.php
public function upgrade(SchemaSetupInterface $setup, ModuleContextInterface $context)
{
    $setup->startSetup();
    if (version_compare($context->getVersion(), '2.0.1', '<')) {
        $setup->getConnection(self::$connectionName)->addIndex(
            $setup->getTable('quote_id_mask', self::$connectionName),
            $setup->getIdxName('quote_id_mask', ['masked_id'], '', self::$connectionName),
            ['masked_id']
        );
    }
    if (version_compare($context->getVersion(), '2.0.2', '<')) {
        $setup->getConnection(self::$connectionName)->changeColumn(
            $setup->getTable('quote_address', self::$connectionName),
            'street',
            'street',
            [
                'type' => \Magento\Framework\DB\Ddl\Table::TYPE_TEXT,
                'length' => 255,
                'comment' => 'Street'
            ]
        );
    }
    //drop foreign key for single DB case
    if (version_compare($context->getVersion(), '2.0.3', '<')
        && $setup->tableExists($setup->getTable('quote_item'))
    ) {
        $setup->getConnection()->dropForeignKey(
            $setup->getTable('quote_item'),
            $setup->getFkName('quote_item', 'product_id', 'catalog_product_entity', 'entity_id')
        );
    }
    $setup->endSetup();
}
This is, shall we say, problematic for a number of reasons. While it works, as Magento 2 continues to release new versions, these upgrade classes will quickly grow unwieldy in size. Also, there’s a not-insignificant chance a programmer will inadvertently change an older version_compare if block while editing the file for the latest version. Also, while useful, PHP’s version_compare function can be a little ambiguous as to how it works, forcing the module developer to think about something that should be automatic. Finally, even if you copy Magento’s pattern, it’s possible to introduce a code branch that indicates a newer version but will still run if the code’s run against an older version. Consider this code
if (version_compare($context->getVersion(), '2.2.0', '<')) {
    //...
}
If the user is upgrading the module to version 2.1.1, but for some reason the 2.2.0 branch is present in code, the branch will still run. Again, developers are forced to think about something the previous system handled automatically.
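To make the pitfall concrete, here is a small sketch comparing the installed-version-only check with a stricter guard that also looks at the version being upgraded to. The shouldRunBranch function is hypothetical and not part of Magento, and where the target version comes from is left as an assumption. The last line also shows one of version_compare's less obvious rules.

```php
<?php
// Hypothetical guard, NOT part of Magento: a branch runs only when its
// version is newer than what is installed AND no newer than the target.
function shouldRunBranch(string $installed, string $target, string $branch): bool
{
    return version_compare($installed, $branch, '<')
        && version_compare($branch, $target, '<=');
}

// The core-style check only looks at the installed version, so a 2.2.0
// branch passes even when the module is only moving to 2.1.1.
var_dump(version_compare('2.0.1', '2.2.0', '<'));      // bool(true)

// Upgrading 2.0.1 -> 2.1.1 with the stricter guard.
var_dump(shouldRunBranch('2.0.1', '2.1.1', '2.2.0'));  // bool(false)
var_dump(shouldRunBranch('2.0.1', '2.1.1', '2.1.0'));  // bool(true)

// version_compare treats "1.0" as older than "1.0.0".
var_dump(version_compare('1.0', '1.0.0', '<'));        // bool(true)
```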
I’m hesitant to speculate as to why Magento created a seemingly inferior migration system for Magento 2. However, lacking other evidence, this sure looks like a feature implemented by a team that wasn’t taking the long view on their platform, and by a developer who had little experience writing these sorts of systems and lacked a supportive and candid peer review. It’s particularly incongruous when you consider it’s led to giant if/then blocks from a team that so heavily trumpeted the need for massively abstract class-based OOP systems.
Using Pestle to Create Upgrade Classes
Regardless of what we may think of it, this is the system we have in Magento 2. This brings us to the magento2:generate:schema-upgrade command. This command will
- Populate a module with an UpgradeSchema and UpgradeData class
- Provide an optional default implementation for both classes that uses versioned scripts.
To use this command, just run the following (replacing Pulsestorm_Helloworld with your own module name)
$ pestle.phar magento2:generate:schema-upgrade
Module Name? (Pulsestorm_Helloworld)] Pulsestorm_Helloworld
New Module Version? (0.0.2)] 0.0.2
When complete, you’ll have three new classes
app/code/Pulsestorm/Helloworld/Setup/UpgradeData.php
app/code/Pulsestorm/Helloworld/Setup/UpgradeSchema.php
app/code/Pulsestorm/Helloworld/Setup/Scripts.php
two new setup scripts (in new top-level module folders)
app/code/Pulsestorm/Helloworld/upgrade_scripts/data/0.0.2.php
app/code/Pulsestorm/Helloworld/upgrade_scripts/schema/0.0.2.php
and pestle will increment the version in your module.xml file to match the version you specified. The class names and paths above are based on our passing the command Pulsestorm_Helloworld; your classes will be named based on the module name you pass to the command.
Pestle’s Classes
The first two classes pestle creates
app/code/Pulsestorm/Helloworld/Setup/UpgradeData.php
app/code/Pulsestorm/Helloworld/Setup/UpgradeSchema.php
are mostly standard Magento 2 UpgradeData and UpgradeSchema classes. We say mostly standard, because they do contain a default implementation
public function __construct(
    \Pulsestorm\Helloworld\Setup\Scripts $scriptHelper
)
{
    $this->scriptHelper = $scriptHelper;
}
/**
 * {@inheritdoc}
 */
public function upgrade(
    SchemaSetupInterface $setup,
    ModuleContextInterface $context
)
{
    $setup->startSetup();
    $this->scriptHelper->run($setup, $context, 'schema');
    $setup->endSetup();
}
This default implementation calls the generated Pulsestorm\Helloworld\Setup\Scripts object’s run method. The Pulsestorm\Helloworld\Setup\Scripts object implements a simple, traditional setup resource script system. With the above in place, when Magento first sees version 0.0.2 of this module, the Pulsestorm\Helloworld\Setup\Scripts class will include the 0.0.2.php files in the upgrade_scripts/data and upgrade_scripts/schema folders, as well as any other versioned files that exist between the old module version and the new, current module version.
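The selection step described above can be sketched roughly as follows. This is not the code pestle generates: selectUpgradeScripts is a hypothetical function, and it works on a plain list of file names rather than reading the upgrade_scripts folders, purely to show how “every versioned file between the old and new version” might be picked and ordered.

```php
<?php
// Hypothetical sketch, NOT pestle's generated Scripts class: pick the
// versioned script files newer than $installed, up to and including $target.
function selectUpgradeScripts(array $fileNames, string $installed, string $target): array
{
    $versions = [];
    foreach ($fileNames as $fileName) {
        $versions[] = basename($fileName, '.php'); // "0.0.2.php" -> "0.0.2"
    }

    $versions = array_filter($versions, function ($version) use ($installed, $target) {
        return version_compare($version, $installed, '>')
            && version_compare($version, $target, '<=');
    });

    // Oldest first, compared as versions rather than as strings.
    usort($versions, 'version_compare');

    return array_map(function ($version) {
        return $version . '.php';
    }, $versions);
}

$files = ['0.0.10.php', '0.0.2.php', '0.0.1.php', '0.0.3.php'];
print_r(selectUpgradeScripts($files, '0.0.1', '0.0.3'));
// 0.0.2.php, then 0.0.3.php
```

Sorting with version_compare rather than a plain string sort matters once versions reach two digits, since 0.0.10 should come after 0.0.3.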
It’s not necessary to use the Pulsestorm\Helloworld\Setup\Scripts helper (and future versions of pestle will give you options to omit it), but for developers who plan on releasing multiple versions of their modules, separating each upgrade out into its own include file seems like a saner approach than what Magento’s doing in their core UpgradeSchema classes.
As always, if you run into problems using this command, or have ideas on how it could be better, we’re active and responsive to all GitHub issues.