Why database should be taught before programming in universities?

Database Programming
Learn Database before Coding

Often students from the initial semester ask me how do we store our data in our programming projects? When students join university to learn about computer science and technology they are usually taught programming first in courses like introduction to programming. As part of the coursework, students are required to work on a project. The majority of the projects, in fact almost all projects involve data handling and that data needs to be stored somewhere, usually in databases.

Problems Students Face

As a novice students don’t know how to store data. One option is to store data in plain text files if filing is taught to them but in that case, their project becomes too complex for them. In my opinion, file format is an advanced topic for students that have just started learning how to program. So, students get stuck on where and how to store data. They create variables and arrays to store data in memory but that is not very useful until they have the option to store their data somewhere permanently that they can retrieve later. Otherwise, every time they run their project they have to feed data from the beginning.

Teach Database Before Programming

If universities modify their courses and add database in the first semester and replace programming courses with it then it would be easier for students to get started in computer science degree. Introduction to databases is a relatively easier course than programming and students will know what a database is, how to store data in the database, and how to retrieve it later using SQL. Then in the next semester if they do a programming course then it will require only one lecture to teach them how to access a database from your code and how to store and retrieve data. That will make their projects more valuable and make more sense to them and they can take it to an advanced level in forthcoming courses.

Your Take?

What is your opinion? Please, let me know in the comments.

Click here to read more about Databases.

Laravel: Specified key was too long error on migration

When you install a new Laravel project with ‘laravel new’ and run the migration that comes with it you might get the following error:

#php artisan migrate
Migration table created successfully.


 [Illuminate\Database\QueryException]
 SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes (SQL: alter table `users` add unique `users_email_unique`(`email`))

[PDOException]
 SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes

To solve this error edit your app/Providers/AppServiceProvider.php file.

Add the namespace:

use Illuminate\Support\Facades\Schema;

and then add the following line in boot() method of your AppServiceProvider class.

use Illuminate\Support\Facades\Schema;

class AppServiceProvider extends ServiceProvider
{
 /**
 * Bootstrap any application services.
 *
 * @return void
 */
 public function boot()
 {
   Schema::defaultStringLength(191);
 } 
}

This should solve your problem. Just delete all the tables and rerun the migration.

Click here to read more about Laravel.

Setup Laravel with Docker containers

Docker

Docker has captured my attention lately and has been growing exponentially for last few years. Docker has revolutionized the virtualization space and has given DevOps engineers and developers a new set of tools that can ease their development as well as infrastructure resource utilization. Mostly Node.js apps have been deployed with Docker but it is not limited to only that. If you’ve been developing apps for web using PHP and Laravel framework then you are lucky that community has developed some great tools to utilize docker in their development workflow. If you are new to Docker and want to learn more about Docker then visit their website docker.com specially the section What is Docker?

Laradock

Laradock does the hard part and facilitate developers to quickly and easily setup Laravel development environment in seconds. Follow the these steps to get started.

Setup fresh environment then install Laravel inside docker

  • Create project directory. e.g. myproject
  • Create another directory within myproject directory for source that you will use commit to git repo. eg. app
  • Clone the laradock repo inside myproject folder, this will create laradock folder
git clone https://github.com/laradock/laradock.git
  • Edit the docker-compose.yml file to map to your project directory once you have it (eg. – ../app:/var/www)
  • Run the following command
docker-compose up -d nginx mysql redis beanstalkd
  • Enter the workspace
docker-compose exec workspace bash
  • cd to /var
  • Install Laravel
composer create-project laravel/laravel www
  • Open your .env file and set the following
DB_HOST=mysql
REDIS_HOST=redis
QUEUE_HOST=beanstalkd

Now open your browser and visit http://localhost

You will see running Laravel app. That was cool!

You can also connect to mysql using these settings.

Host: localhost
User: root
Password: root

If you want to add more services see the official laradock docs. It support tons of other services like memcache, pgsql etc.

Daily Use

Following are some docker related commands that you want to use on daily basis once your development environment is setup.

  • To bring up the servers to run application
docker-compose up -d nginx mysql
  • Enter workspace container to run commands like artisan, composer, phpunit, gulp etc.
docker-compose exec workspace bash

or on Windows PowerShell

docker exec -it {containerid} bash
  • List current running containers
docker ps
  • List current project containers
docker-compose ps
  • Close all running containers
docker-compose stop
  • Delete all existing containers
docker-compose down
  • View the log files
docker logs {container-name}

Insert new rows as a result from other SQL query

There were numerous times where I needed to insert records in table as a result of another query, or copy data from one table to another. And every time I forgot the syntax of insert…select query. So I thought I should write it somewhere and blog is the best place for this 🙂

A very basic example of insert…select query is:

INSERT INTO tbl_temp2 (fld_id)
  SELECT tbl_temp1.fld_order_id
  FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

Use this as a reference and build your own query. You can use joins as well. If you use MySQL and want further details see http://dev.mysql.com/doc/refman/5.1/en/insert-select.html.

Xdebug in WebMatrix

Xdebug is a very popular PHP extension that helps with debugging and profiling of PHP scripts by providing a lot of valuable debug information. Microsoft WebMatrix is a development tool for building web applications. When WebMatrix is used to build or modify a PHP-based web application the debugging tasks can be greatly simplified if Xdebug extension is used. This post explains how to install and use Xdebug extension with WebMatrix.

Step 1: Enable PHP in WebMatrix from the site “Settings” page:

Note that if you installed any of the PHP applications from the Application Gallery then PHP will be automatically enabled.

Step 2: Download the appropriate build of Xdebug extension from downloads page. If your site uses PHP 5.2 then download “5.2 VC6 Non-thread safe (32 bit)”. If your site uses PHP 5.3 then download “5.3 VC9 Non-thread safe (32 bit)”. Use 32 bit build even if your Windows OS is 64 bit.

Step 3: Install the extension by copying the downloaded file to the following locations:

  • For PHP 5.2 on Windows 64 bit:
    C:\Program Files (x86)\IIS Express\PHP\v5.2\ext\
  • For PHP 5.2 on Windows 32 bit:
    C:\Program Files\IIS Express\PHP\v5.2\ext\
  • For PHP 5.3 on Windows 64 bit:
    C:\Program Files (x86)\IIS Express\PHP\v5.3\ext\
  • For PHP 5.3 on Windows 32 bit:
    C:\Program Files\IIS Express\PHP\v5.3\ext\

Step 4: Open the php.ini file located in the PHP installation folder, e.g.
C:\Program Files\IIS Express\PHP\v5.2\php.ini and append the following at the end (make sure that the absolute path is correct for your version of PHP and Windows.):

[xdebug]
zend_extension = C:\Program Files\iis express\PHP\v5.2\ext\php_xdebug-2.1.0-5.2-vc6-nts.dll

Step 5: Configure PHP to display errors by changing these PHP settings in php.ini file:

display_errors = On
error_reporting = E_ALL & ~E_NOTICE

Step 6: Test that extension is enabled and works by either calling a phpinfo() function from a script or by running a buggy script:

The Xdebug extension provides a lot of useful features that help with debugging of PHP applications. You can learn more about them from the Xdebug documentation. For example you can use it to profile a PHP application. Just change the php.ini file as shown below and then make a request to a PHP script:

[xdebug]
zend_extension = C:\Program Files\iis express\PHP\v5.2\ext\php_xdebug-2.1.0-5.2-vc6-nts.dll
xdebug.profiler_enable = On
xdebug.profiler_output_dir = C:\Windows\temp

The profile log will be saved into the specified directory and will have a name cachegrind.out.* Use WinCacheGrind to open and analyze it:

MySQL Installation and Configuration

Since I have been involved with MySQL development for last several years, so I constantly look for good resources related to MySQL. I found a very good article on this at IIS site (http://learn.iis.net/page.aspx/610/walkthrough—set-up-mysql-51-for-php-applications/). I thought I should refer that to my blog. I am providing whole article as it is here.

Overview

This article provides a basic step by step guide on how to install and configure MySQL on the Windows Operating System. For more detailed instructions about installing and configuring MySQL on Windows refer to the official MySQL documentation.

Downloading and Installing MySQL

The MySQL binaries and installer can be downloaded from the official MySQL site. The instructions in this article are based on MySQL version 5.1 Community Edition installed with the Windows MSI installer.

Run the installer and choose the installation option. For a majority of the cases, the typical installation is sufficient:

However, if you want to control which components get installed or if you want to use a non-default installation path then choose the “Custom” option.

When the installation is complete, make sure to check the box to “Configure the MySQL Server now”. This will launch the “MySQL Server Instance Configuration Wizard” that will guide you through the configuration process for the MySQL instance.

Configuring MySQL instance

Follow these steps in the “MySQL Server Instance Configuration Wizard” to optimize the MySQL configuration for the kind of tasks you expect it to perform.

On the first page of the wizard choose “Detailed Configuration”:

On the next page choose the server type option:

Choose the “Database Usage” option:

The Database usage options control what kind of database storage engine is used on the server:

  • MyISAM – Optimized for high performance SELECT operations. It has low overhead in terms of memory usage and disk utilization, but at the cost of not supporting transactions
  • InnoDB – Provides fully ACID transactional capabilities, but at the cost of more aggressive usage of disk space and memory

For an in-depth comparison of these database engines, refer to MySQL Storage Engine Architecture. As a general recommendation – if the web applications on your server require multi-statement transactions, advanced isolation levels and row-level locking, foreign key constraints, or otherwise have a requirement for ACID features — use InnoDB. Otherwise, use MyISAM.

Next choose the number of concurrent connections to the server:

On the next page choose networking options :

If you have mysql and web server on the same machine you may consider not enabling TCP/IP networking and instead use named pipes. Note though that some PHP applications may require TCP connection to MySQL. Refer to the application’s documentation to confirm if it supports named pipes connection to MySQL.

Choose the default charset to use when creating new databases:

Next ensure that MySQL will be configured as a Windows Service:

Optionally, you can add the MySQL Bin directory to the Windows PATH environment variable. That will make it easier to launch MySQL tools from the command line.

Finally provide the password for the database administrative account, which in called “root” in MySQL. Make sure that you leave the “Create an Anonymous Account” checkbox cleared:

On the next page click “Execute” to apply all the configuration settings and to start the MySQL service:

Now you can logon to MySQL by opening a command line window and typing:

mysql -u root -p
Enter password: ******If MySQL was configured correctly then the MySQL prompt will be shown:

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server Version 5.1.32-community MySQL Community Server (GPL)
Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the buffer.
mysql>

MySQL Query Cache

MySQL query cache is good to cache select queries and their results and improve some performance of your database applications. So I thought I should write a quick tutorial to quickly set up the query cache without going into the details. So here what you have to do to enable query cache on MySQL.

If you want to enable query cache without restarting the MySQL server then just run the following command on MySQL prompt/client:

set global query_cache_size=67108864;


This will set MySQL query cache size to around 64MB and also enable the query cache. You can set it to any value you desire. You can then check the status later by running following commands on MySQL prompt/client:

SHOW VARIABLES LIKE ‘%query_cache%’;

SHOW STATUS LIKE ‘%qcache%’;


These commands will show some useful information about the query cache like if query cache is enabled, how much memory is being used, how many queries are currently in cache, cache hit rate etc.

This is really a quick way to enable query cache. For details see MySQL documentation http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html

Multiple MySQL on single host

Sometimes we need to run multiple MySQL servers on single machine. That is mostly required in testing environments to test different aspects with different configurations. In this way one can test server without affecting others. So, if you want to run multiple MySQL you can use MySQL Sandbox which eases the whole process of installing and configuring the server. Here how will you do it.

First of all you need to install MySQL Sandbox. You can download it from https://launchpad.net/mysql-sandbox.

Then you need tar balls of MySQL server. You can download it from MySQL site.

After installing MySQL Sandbox you can run following script to install MySQL.

make_sandbox /path/to/mysql-X.X.XX-osinfo.tar.gz

This script will tell you some information like port, user name, and password which you can use to login to MySQL after installation. After confirmation it will install and run MySQL. That’s it! You are up and running.

If you want to install another MySQL you can just run the following command.

make_sandbox /path/to/mysql-X.X.XX-osinfo.tar.gz –check_port

The –check_port option checks the first available port so it can install and run on that port. By default it will use the MySQL version as port. For example if you have MySQL version 4.1.20 it will run MySQL on port 4120. And if it is not available then it will try 4121.

MySQL Sandbox provides other useful scripts to manage the server. So installing and running multiple MySQL, even different versions, is that easy 🙂

You can find the complete documentation at http://forge.mysql.com/wiki/MySQL_Sandbox#Single_server_sandbox.

Database indexes and locks

I was discussing with my friend on the issue I have discussed in my post Referential Integrity (http://mjawaid.wordpress.com/2009/04/01/referential-integrity/) and Mapping tables (http://mjawaid.wordpress.com/2009/04/02/mapping-tables/). My friend told me the scenario that when we create multiple indexes on the table we will get deadlocks. The reason he told me was the bucket lock, or in other words gap lock. Actually what server does is that due to multiple indexes when you lock any one record it locks multiple records, since it searches indexes and all the records it encounters during search, it locks them. That what I understood he was trying to say. I wasn’t convinced and thought of trying to create few tables with indexes and test them. For test I used MySQL4 and InnoDB engine, since the issue we were facing was on it.

So I created five tables indxtest1, indxtest2, indxtest3, indxtest4, and indxtest5.

CREATE TABLE `indxtest1` (

`id` bigint(20) NOT NULL auto_increment,

`name` varchar(255) default NULL,

`fk` bigint(20) default NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

CREATE TABLE `indxtest2` (

`id` bigint(20) NOT NULL auto_increment,

`name` varchar(255) default NULL,

`fk` bigint(20) default NULL,

PRIMARY KEY (`id`),

KEY `NewIndex1` (`fk`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

CREATE TABLE `indxtest3` (

`id` bigint(20) NOT NULL auto_increment,

`name` varchar(255) default NULL,

`fk` bigint(20) default NULL,

PRIMARY KEY (`id`),

KEY `NewIndex1` (`id`,`fk`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

CREATE TABLE `indxtest4` (

`id` bigint(20) NOT NULL auto_increment,

`name` varchar(255) default NULL,

`fk` bigint(20) NOT NULL default ‘0’,

PRIMARY KEY (`id`,`fk`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

CREATE TABLE `indxtest5` (

`id` bigint(20) NOT NULL auto_increment,

`name` varchar(255) default NULL,

`fk` bigint(20) default NULL,

PRIMARY KEY (`id`),

KEY `FK_indxtest4` (`fk`),

CONSTRAINT `FK_indxtest4` FOREIGN KEY (`fk`) REFERENCES `indxtest5parent` (`id`) ON DELETE SET NULL

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

And another table indxtest5parent as parent for indxtest5.

CREATE TABLE `indxtest5parent` (

`id` bigint(20) NOT NULL auto_increment,

`test` varchar(255) default NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC

Let me explain what is the difference between the tables. All tables except the indxtest5parent contain three fields: id, name, and fk. Don’t confuse fk with a foreign key. Id is the primary key in all tables. The main difference between the tables is the indexing on the fk field.

The indxtest1 has no index on fk.

The indxtest2 has an index on fk, so fk is indexed.

The indxtest3 has a multi-field-index on id and fk combined, in addition to primary key index on id.

The indxtest4 has a composite primary key id, fk. Therefore there is a primary key index on id and fk i.e multi-field-index.

The indxtest5 has a foreign key fk mapping to id field of indxtest5parent. So it has foreign key index on fk.

After that I inserted some data in these tables, around just 10 records. I inserted that few records since I wasn’t testing performance with huge data, instead I was just testing that how records are searched in table with indexes, multiple indexes, and without indexes, which is useful in knowing how records are locked, implicitly when updating or explicitly.

So all tables look almost like this after inserting data:

id

name

fk

1

One

10

2

Two

12

3

three

13

4

Four

14

5

Five

15

6

Six

15

7

seven

14

8

eight

13

9

Nine

12

10

Ten

10

Now run the following queries on all tables:

select * from indxtest1 where id = 1;

select * from indxtest2 where id = 1;

select * from indxtest3 where id = 1;

select * from indxtest4 where id = 1;

select * from indxtest5 where id = 1;

These all queries will result in same output, i.e the first record of the table. This is very simple, and since result was filtered using the primary key in the where clause so it scanned only one record during search. We can see this by running following queries:

explain select * from indxtest1 where id = 1;

explain select * from indxtest2 where id = 1;

explain select * from indxtest3 where id = 1;

explain select * from indxtest4 where id = 1;

explain select * from indxtest5 where id = 1;

You will notice that in result the rows column will show 1, that means only one record was scanned during the search.

Now let’s filter the result using the fk field in the where clause:

select * from indxtest1 where fk = 10;

select * from indxtest2 where fk = 10;

select * from indxtest3 where fk = 10;

select * from indxtest4 where fk = 10;

select * from indxtest5 where fk = 10;

All these queries will return the same result, two records with id in 1 and 10. But how many records were scanned during search? To find out the answer run the following queries:

explain select * from indxtest1 where fk = 10;

explain select * from indxtest2 where fk = 10;

explain select * from indxtest3 where fk = 10;

explain select * from indxtest4 where fk = 10;

explain select * from indxtest5 where fk = 10;

In the rows column you will notice that for the indxtest1 table it scanned 10 records. That is reasonable since there was no indexing. Now let’s see for indxtest2 table, 2 records were scanned. This is also reasonable since fk was indexed. So far so good. Now for indxtest3 table 10 records were scanned. Hmm… ok we will discuss this in a moment. Let’s check other queries first. For indxtest4 table 10 records were scanned, and for indxtest5 table only 2 records were scanned. Result for indxtest5 table is also reasonable since it has a foreign key index on it.

Now what are the cases with indxtest3 and indxtest4? If you notice both tables have a multi-column index. According to MySQL documentation:

MySQL uses multiple-column indexes in such a way that queries are fast when you specify a known quantity for the first column of the index in a WHERE clause, even if you do not specify values for the other columns. (http://dev.mysql.com/doc/refman/4.1/en/multiple-column-indexes.html)

It is clearly stated in the documentation that when second column is specified, MySQL will not use the index, or even if second column is used with first column with OR condition, it will not use the index. That’s why queries on indxtest3 and indxtest4 scanned all 10 records during search/select.

Now what is the effect of indexing on locking? According to MySQL documentation:

A locking read, an UPDATE, or a DELETE generally set record locks on every index record that is scanned in the processing of the SQL statement. (http://dev.mysql.com/doc/refman/4.1/en/innodb-locks-set.html)

And what is record lock?

Record lock: This is a lock on an index record. (http://dev.mysql.com/doc/refman/4.1/en/innodb-record-level-locks.html)

Few more points from MySQL documentation (http://dev.mysql.com/doc/refman/4.1/en/innodb-locks-set.html):

1 – For SELECT ... FOR UPDATE or SELECT ... IN SHARE MODE, locks are acquired for scanned rows.

2 – SELECT ... FROM ... FOR UPDATE sets exclusive next-key locks on all index records the search encounters.

3 – UPDATE ... WHERE ... sets an exclusive next-key lock on every record the search encounters.

4 – DELETE FROM ... WHERE ... sets an exclusive next-key lock on every record the search encounters.

Conclusion

So, according to MySQL documentation, either we are explicitly locking records (first two points) or locks are implicit (last two points), locks will be acquired on records the search encounters. So if indexing is proper no locks will be acquired on rows on which we don’t want to. Even MySQL documentation says that:

It is important to create good indexes so that your queries do not unnecessarily need to scan many rows. (http://dev.mysql.com/doc/refman/4.1/en/innodb-locks-set.html)

Scanning many records will result in lock on those records due to which deadlocks can occur, performance can be degraded, or anything bad can happen.

If you are going to test the above scenario then you will also notice the performance differences between the queries if you have reasonable number of records in tables.

Many-to-many relationship

“A junction table, sometimes also known as a “Bridge Table”, “Join Table”, “Map Table”, or “Link Table”, is a table that contains common fields from two tables. It is on the many side of a one-to-many relationship with the other two tables. Junction tables are employed when dealing with many-to-many relationships in a database.” (http://en.wikipedia.org/wiki/Junction_table)

As described in the definition, mapping tables are used when there is a many-to-many relationship between two tables. In this case, a third mapping table is used to map the relation between those two tables.

But I have seen this in one of my employer products where they were using mapping tables in lot of places. The most common and used was in financial application, where data was very crucial. I’ll explain this with an example for education domain. Let’s take classic example of students and classes. A student can be in only one class and in one class there can be multiple students. Therefore, the relationship between student and class is many-to-one. Or from other side relationship between class and student is one-to-many.

So, ideally we will have a table structure with two tables say, STUDENT and CLASS. CLASS will have a PK say CLASSNUM and fields for other class details. Similarly STUDENT will have a PK say STUDENTNUM and fields for other student details. One more field that STUDENT will have is FK to CLASSNUM field.

This is the most obvious design of the scenario. But in the above mentioned application this was not the case. We had a mapping table between STUDENT and CLASS. Yes! You have read it correctly. We had a STUDENTCLASSDETAILS table between STUDENT and CLASS, having a composite key of STUDENTNUM and CLASSNUM. But what is the reason for a mapping table here? What I have heard, (since I was not the designer of that DB at that time) that this mapping table has been introduced to improve performance and resolve locking issues.

Well, let’s discuss how it can improve performance. Now consider the scenario where there are no relationships maintained in the database i.e. no referential integrity (Read my other blog on not having referential integrity – http://mjawaid.wordpress.com/2009/04/01/referential-integrity). So there are no FKs in any table. Now we have a STUDENT with STUDENTNUM (PK), other fields, and CLASSNUM (note that it is not FK, and is not indexed either). CLASS has CLASSNUM (PK), and other fields. STUDENTCLASSDETAILS has STUDENTNUM and CLASSNUM as PK i.e. composite key and these are not FKs.

Now if two users are performing any operation on same record then first transaction locks some records in the STUDENT table then another transaction will not be able to read that students information. In this case the STUDENTCLASSDETAILS will allow us to read the information, at least student number and his class, since this information is most accessible. This way the second transaction will not wait for the other transaction resulting in improved performance and first transaction will acquire the lock successfully.

Now let’s see what are the problems with this approach?

First is the maintenance. When moving student from one class to another (when promoting to next level) two tables have to be updated, resulting in performance hit. Second, mapping table allows student to be inserted in multiple classes and we have witnessed this, resulting in loss of data integrity which is very crucial in financial applications and one can lost a big amount of money just due to silly mistakes. I’ll quote a statement from another forum here:

The designer of an application has a fiduciary responsibility to his employer/client and needs to ensure that data is as acurate as possible. To not enforce referential integrity is to tempt fate. Employees get fired for building systems that contain bad data leading to bad business decisions. Consultants get sued.

Pasted from <http://www.access-programmers.co.uk/forums/archive/index.php/t-33531.html>


Although the person made a comment on not having referential integrity but the underlined statement has the crux of the topic. So the conclusion is when designing an application or database, also consider other scenarios and pros and cons of the approach being adopted, not only one scenario. In other words if one approach is solving your problem then also consider what other problems we can face due to it, and prepare for those as well.