Slashdot
On PHP and Scaling
Posted by
CowboyNeal
on Sat Jul 03, 2004 08:37 AM
from the web-pages-served-fresh-all-day dept.
jpkunst writes "Chris Shiflett at oreillynet.com summarizes (with lots of links) a discussion about scalability, brought about by Friendster's move from Java to PHP. Chris argues that PHP scales well, because it fits into the Web's fundamental architecture. 'I think PHP scales well because Apache scales well because the Web scales well. PHP doesn't try to reinvent the wheel; it simply tries to fit into the existing paradigm, and this is the beauty of it.' (The article is also available on Chris' own website.)"
This discussion has been archived.
A few things that could lead to scalability (Score:5, Interesting)
Re:A few things that could lead to scalability (Score:4, Insightful)
Re:A few things that could lead to scalability (Score:5, Interesting)
Gah, no! (Score:5, Funny)
All right, he used the word "paradigm", that makes his opinion automatically invalid.
Re:Gah, no! (Score:5, Funny)
Author seems to live in a vacuum (Score:5, Insightful)
Re:Author seems to live in a vacuum (Score:5, Informative)
(http://mike.van.lammeren.net/ | Last Journal: Tuesday June 10 2003, @11:10AM)
The article doesn't mention it, but Smarty [php.net] is an excellent PHP library that implements, among other things, caching. I have used it extensively with excellent results.
Re:Author seems to live in a vacuum (Score:5, Informative)
Personally, I find the lighter-weight Savant [phpsavant.com] to be a better choice, since it's straight PHP (no syntax to learn either -- bonus!). That removes the need for Smarty's "compile into PHP" step entirely, which has given me MUCH better performance than when I was using Smarty. IMHO and in my experience, at least.
(And if you want caching, it can be done at the PHP engine level rather than in your templating engine -- see any of the PHP accelerators out there.)
Re:Author seems to live in a vacuum (Score:4, Informative)
(http://feedharvest.com/)
But if you are running a site that can use the output caching that Smarty offers, and the code is done properly, you will see huge speed increases, because you can skip everything in the page, including opening a DB connection. That gives very close to flat-HTML performance.
As for PHP accelerators, they don't handle output caching by themselves. You can code your own, but my time is better spent doing other things.
Using Smarty and Turck together is pretty impressive.
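For anyone who hasn't seen output caching in action, here is a minimal language-neutral sketch in Python. It is not Smarty's API; the class name `OutputCache`, the `fetch` method and the fake `render_front_page` are all made up for illustration. The point is only that a cache hit returns stored HTML without running the page code or touching the database.

```python
import time


class OutputCache:
    """Tiny in-memory page cache keyed by URL (illustrative, not Smarty's API)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # url -> (expires_at, html)

    def fetch(self, url, render):
        now = self.clock()
        hit = self.store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]  # cache hit: render() and its DB work are skipped
        html = render()  # cache miss: do the expensive work once
        self.store[url] = (now + self.ttl, html)
        return html


db_queries = 0


def render_front_page():
    """Hypothetical page: pretend it opens a DB connection and runs queries."""
    global db_queries
    db_queries += 1
    return "<html>front page</html>"


cache = OutputCache(ttl_seconds=60.0)
first = cache.fetch("/", render_front_page)
second = cache.fetch("/", render_front_page)  # served from the cache
```

After the two calls, `db_queries` is 1: the second request never reached the rendering code. An opcode accelerator does not give you this; it only skips recompiling the script.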
PHP scales down, too (Score:5, Insightful)
(http://www.welton.it/davidw/)
What does this mean? That they don't consume too much in the way of resources, and are very easy to get started with. This puts a dynamic web site within reach of more people, which is a good thing, even if inevitably some of them will, yes, write crappy code. It is another example of the "worse is better" philosophy.
I just wish they had used Tcl or something else already out there instead of creating a language that in and of itself is nothing very exciting, and has been a bit slow.
Re:Not always a good thing... (Score:5, Insightful)
(Last Journal: Thursday April 21 2005, @12:15PM)
I hate to say it, but the problem exists between keyboard and chair. PHP is not an inherently secure or insecure language. It may still have bugs, but those are a function of age and the serious ones have been taken care of. Rather, the problem is in the way people write software using PHP, without necessarily understanding the nature of the platform they are using.
It is not the job of the language to enforce security - it is the job of the programmer.
Another article (Score:4, Informative)
http://www.onjava.com/pub/a/onjava/2003/10/15/p
jsp is a bad idea, but Java is not (Score:5, Informative)
Re:jsp is a bad idea, but Java is not (Score:4, Informative)
(http://cumulo-nimbus.com/)
just as Velocity on its own would be a bad idea.
Write your business logic in plain Java, use servlets to manage the flow of control and to call your Java API to create value objects (beans) to place in the request, and then use JSP to format the data.
You only run into problems if you try to do everything with JSP, which is always a bad idea, just as it's always a bad idea.
And JSP 2.0 is even better, with the JSTL expression language built in.
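To make the split concrete, here is a rough sketch of the same three roles. It is written in Python rather than Java/JSP to keep it short, and every name in it (`Order`, `load_order`, `render_order`, `handle_request`) is invented for the example; treat it as an outline of the division of labour, not as servlet code.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Value object (the "bean") handed from the controller to the view."""
    order_id: int
    total_cents: int


def load_order(order_id):
    """Business logic: knows nothing about HTTP or HTML."""
    prices = {1: 1999, 2: 500}  # hypothetical data standing in for a database
    return Order(order_id, prices[order_id])


def render_order(order):
    """View (the JSP role): formats data only, no queries and no decisions."""
    return f"<p>Order {order.order_id}: ${order.total_cents / 100:.2f}</p>"


def handle_request(params):
    """Controller (the servlet role): reads the request, calls the model,
    then hands the resulting value object to the view."""
    order = load_order(int(params["id"]))
    return render_order(order)


page = handle_request({"id": "1"})
```

The view never sees the request and the business logic never sees markup, which is the property that goes missing when everything is done in the template.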
Re:jsp is a bad idea, but Java is not (Score:5, Interesting)
Tapestry for the view, Spring for the control, and Hibernate for the model is a combination that is hard to beat with PHP. Sooner or later all these technologies will be used no matter what the underlying language.
What's Really Going On Here... (Score:5, Interesting)
Sorry buddy... (Score:1, Insightful)
Re:Sorry buddy... (Score:4, Informative)
(http://www.hotgazpacho.org/ | Last Journal: Friday September 19 2003, @01:55AM)
See their explanation on why they use PHP [yahoo.com]
Definition of Scalable (Score:5, Insightful)
(Last Journal: Wednesday June 05 2002, @05:44AM)
His definition suits him well but it might not be helpful for me.
I might use scalable just to say that an application can easily (with little or no modification) handle 100x more users. This doesn't necessarily mean that system load grows by some minimal, fixed amount per extra request. All that matters is that it will work with higher demand. Who cares how or why?
I think scalable can also mean that an app can handle 10,000 users when hosted on a single machine but when put on a cluster of computers it can handle exponentially more users. To me that is a scalable application.
Scalable has no set definition in the context of applications.
Agree on definition first (Score:5, Interesting)
It is impossible to say PHP is or is not scalable unless a definition can be agreed on. And with "scalable's" current buzzword status, I don't see that happening very soon.
scalability is a dead issue (Score:5, Insightful)
(http://www.wdogsystems.com/ | Last Journal: Thursday October 06 2005, @10:10AM)
My big issue with PHP is maintainability. I see it (perhaps incorrectly) as a glorified templating language, which places it on the same evolutionary track as ASP and ColdFusion; developers will tend to munge SQL calls into the templates, blow off any MVC separation, and get a system that is very hard to keep going for more than a few revisions.
Re:scalability is a dead issue (Score:5, Interesting)
Yes, that is tempting. But, conversely, it's a very useful capability for small projects. For larger projects, you just need to ensure you have the discipline not to use the capabilities.
For instance, here [covcen.org.uk] is a site I developed in PHP using a strict model-view separation. There is direct linkage between view and controller and controller and model -- I couldn't be bothered to sort that out for a project of limited size like that one. In a larger project, I'd probably devise some kind of mechanism for that.
You can write unmaintainable code in any language you choose. Discipline is the key.
Who knows what scales? (Score:3, Interesting)
(http://www.webalianza.com/)
PHP has wide support for many RDBMSs, APIs and operating systems, but it is only a language. A language doesn't scale; it's the platform that scales.
That's why I see PHP/Apache/Unix scaling far better than (for example) ASP/IIS/NT: the first platform can run on anything from a PDA to a high-performance minicomputer; the second can run on anything from an i686 (Pentium support was removed?) to the best PC-architecture computer you can buy. That's the difference: a wide-option platform versus a closed-option platform.
Probably the first platform will have performance leaks and will not extract every bit of performance from the machine it runs on, but its scalability potential lies in the fact that it can run on whatever you throw it at. Maybe J2EE or other platforms will run faster than PHP on the same hardware, but PHP will scale there too and will stand shoulder to shoulder with them.
That's why I don't like to evaluate scalability from the "speed" point of view, but from the "where it runs" point of view.
Maintaining State (Score:1)
"Scalability is gained by using a shared-nothing architecture where you can scale horizontally infinitely. A typical Java application will make use of the fact that it is running under a JVM in which you can store session and state data very easily and you can effectively write a web application very much the same way you would write a desktop application. This is very convenient, but it doesn't scale. "
Storing and more importantly trying to replicate stored state via sessions in Java can be expensive, but saying Java scales badly because it makes it easy to do things that don't scale well is a poor argument. I don't know enough about the merits of PHP to comment on how it deals with this issue, but when you've done lots of server side Java programming you learn to be very judicious in the use of Session scope.
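A small sketch of what "shared-nothing" means in practice, in Python rather than Java or PHP. The `SessionStore` and `Worker` classes are hypothetical: the store stands in for whatever shared backend you use (a database, a cache server), and the workers stand in for web server processes that keep no per-user state of their own.

```python
class SessionStore:
    """Stand-in for a shared backend such as a database (hypothetical)."""

    def __init__(self):
        self._data = {}

    def load(self, sid):
        return dict(self._data.get(sid, {}))

    def save(self, sid, session):
        self._data[sid] = dict(session)


class Worker:
    """A web worker that keeps nothing in memory between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, sid):
        session = self.store.load(sid)  # state comes in with the request...
        session["hits"] = session.get("hits", 0) + 1
        self.store.save(sid, session)   # ...and goes back out before replying
        return session["hits"]


store = SessionStore()
workers = [Worker("a", store), Worker("b", store), Worker("c", store)]

# Six requests from one user, spread round-robin over three workers.
counts = [workers[i % 3].handle("user-42") for i in range(6)]
```

`counts` comes out as 1 through 6 even though no worker saw two consecutive requests, so any worker can be added or removed without replicating in-memory sessions. The cost is a round trip to the store on every request, which is the trade-off the quote glosses over.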
Yahoo. (Score:5, Interesting)
I worked in a small shop developing web apps, and while it wasn't mission critical stuff like banking, it wasn't exactly brainless "dump data from MySQL" stuff either. I was lucky that my boss wasn't picky about languages. But if anyone I work with doubts the power and simplicity of PHP, I usually bring up Yahoo.
IMHO, PHP rocks. It's suitable for pretty much any and all web development. It can be used for quick hacks, or you can code it like a pro with objects and stuff.
Re:Yahoo. (Score:5, Informative)
Yahoo is very much a C/C++ shop first and foremost - PHP is used as a template system (alongside several proprietary systems) to allow easy modification of high level behaviour.
PHP is not always good enough (was Re:Yahoo) (Score:5, Insightful)
Yes, PHP is excellent for web development. Yes, PHP can scale to even some large web sites. But since the web is still all the rage, this is unfortunately all that many people think about. Where PHP stumbles is when you need to move off the web or when you need to write complex business logic that is not solely driven by a web tier. PHP also fails when you need to integrate diverse transactional resources in an efficient manner. Not all business applications can be suitably implemented in PHP. As examples:
- PHP, by its scripted execute-and-terminate nature, cannot schedule the execution of tasks on its own. So, for example, there is no way to schedule an email to be sent at a specified time. If you need this sort of functionality, you'll have to look beyond PHP to ugly hacks like cron jobs that call PHP. (and then PHP scripts that can automatically modify your cron scripts..) Alternatively, you could write your own scheduler in a different language.
- Somewhat related, PHP is incapable of asynchronous operation. Suppose, for example, that we have a flood of customers placing orders. Our inventory database is fully capable of keeping up with the demand, but the credit card processing system is backlogged, and this is out of our control. So we cannot give users an immediate response as to whether their payment was accepted upon placing the order. We also don't want to make them wait 5-10 minutes after hitting the "place order" button for a response. The proper business solution is to accept the order, but send the customer an email later if the payment was rejected. This process requires asynchronous operation -- queueing of the payment validation requests and possible further action separate from user interaction. PHP has no solution for this scenario or the many others like it, and thus we must look beyond the PHP domain.
- PHP is quite weak when it comes to writing a complex business logic layer. This is not to say that it is not possible, but there are no frameworks available comparable to those offered in the Java world (and I'm not just talking about EJB, btw). So this is not a question of languages, but of available tools to do the job efficiently. For example, PHP has no concept of application-level transaction management. (declarative transactions, isolation levels, etc.) Looking towards the cutting edge, it has no support for Aspect Oriented Programming, which is an enormous boon to business logic developers, available in Java, C++,
- PHP is weak on tools for developing the persistence layer. For example, it has nothing comparable to Hibernate, let alone tools for RAD employing UML.
- PHP has no pre-built solutions for caching persistent data, and certainly not objects. Once again, it is possible, but developers are left to roll their own solutions using shm extensions or writing out to the database backend. Using the database can be terribly slow and even the shm approach requires (de-)serialization on script load/terminate. While this sort of thing does not limit scalability, it does limit performance (response times).
- PHP has no means of replicating application state in a cluster other than using the backend database. While this is often of no consequence, some complex business software holds a fair amount of state which need not be persistent.
- PHP itself cannot reasonably be used to develop non-web clients such as a GUI tool for efficient rapid data entry or greater interactivity, a PDA client, or an embedded device that interfaces with a campus security system. These sorts of clients can talk to PHP scripts via SOAP extensions, but it should be recognized that we have again left the PHP domain to meet these needs and the resulting solution may not be the most efficient.
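The accept-now, validate-later flow from the payment example above can be sketched as follows. This is a language-neutral illustration in Python using the standard `queue` and `threading` modules; `slow_payment_check`, the order fields and the email addresses are all hypothetical, and a real system would use a durable queue, not an in-process one.

```python
import queue
import threading


def slow_payment_check(order):
    """Hypothetical stand-in for the backlogged card processor."""
    return order["card"] != "bad"


def place_order(order, pending):
    """Web tier: enqueue the order and answer at once, without waiting."""
    pending.put(order)
    return "accepted"


def payment_worker(pending, rejected_emails):
    """Background worker: validates payments and records who to email."""
    while True:
        order = pending.get()
        if order is None:  # sentinel value: shut the worker down
            break
        if not slow_payment_check(order):
            rejected_emails.append(order["email"])


pending = queue.Queue()
rejected_emails = []
worker = threading.Thread(target=payment_worker, args=(pending, rejected_emails))
worker.start()

orders = [
    {"email": "a@example.com", "card": "ok"},
    {"email": "b@example.com", "card": "bad"},
]
responses = [place_order(order, pending) for order in orders]

pending.put(None)  # no more orders in this demo
worker.join()
```

Both customers get "accepted" immediately, and only the second address ends up in `rejected_emails` for a follow-up message. The essential piece is a long-lived worker that outlives any single request.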
So in closing, PHP is great for some thing
Scalability and Maintainability go hand in hand (Score:4, Insightful)
(http://lobsteraliens.com/ | Last Journal: Friday November 01 2002, @12:16AM)
PHP will continue to have this problem until someone comes and tells the developers about a nifty invention called 'namespaces'.
Some other things that could help: Standard templating for easier separation of design/content from code, a better module architecture that doesn't require me to recompile just to get some new functionality, some nice standard modules that go with that new architecture.
Of course if someone did all of that you'd have Perl and since we already have Perl, I'll stick with it.
Re:Scalability and Maintainability go hand in hand (Score:5, Informative)
(http://itsbeenconfirmed.com/ | Last Journal: Sunday May 04 2003, @02:33AM)
For the most part though, I would say that PHP is slightly better equipped for web development, just like Perl is better equipped for general scripting tasks... I'm a Python man myself though.
Re:Scalability and Maintainability go hand in hand (Score:5, Insightful)
This is particularly funny coming from a Perl developer. Perl can become unmaintainable on a small project.
Implementing a site in PHP... (Score:4, Insightful)
(http://www.bigattichouse.com/)
The reason (Score:2, Insightful)
(http://risingcode.com/)
rebuttal (Score:4, Informative)
I will start with mandatory links to the great series of articles that Ace's Hardware ran, describing their server scenario and their migration from PHP to Java/J2EE:
The PHP Scalability Myth starts off by defining three types of server architectures. The first, two-tier, and the last, logical-three-tier, are the same conceptually (there is a slight distinction as to whether display and business logic code is "mingled", but this is typically not a performance issue, just an aesthetic or design issue). This two-tier/logical-three-tier architecture is the only one PHP supports natively. The article then proceeds to compare a two-tier PHP architecture against the most elaborate full three-tier Java architecture, which is used rarely in practice, and extremely rarely in the same domain in which a PHP solution is feasible. Instead of comparing apples and oranges (if PHP supported a full three-tier architecture, I would imagine two-tier PHP vs. three-tier PHP would have the same performance discrepancies), let's simply compare the only architecture PHP supports natively, two-tier, against JSP talking directly to a database, as this scenario is the most analogous to the PHP one. Let's also discard any caching, as again this is something that Java handily accommodates but is not natively (or at least easily) available in PHP due to lack of state. And let's assume the database is the largest bottleneck.
The article states:
I'm not sure what "stub" the article is referring to, but I will assume it means an Apache module which talks a "native" protocol to the servlet engine. The first such module was mod_jserv, which could run the servlet engine both in-process and over a compact protocol called AJP (Apache JServ Protocol), which essentially carries pre-parsed HTTP requests. This module, as well as the AJP protocol itself, has gone through several revisions, from mod_jk to mod_jk2. I cannot quite recall, but I think some version of mod_jk might have lost the ability to run in-process. Every other version, including the most current, can, if I recall correctly. This is beside the point, because as far as I know AJP has always been a trivial performance overhead (I believe recent versions can run over Unix domain sockets). In fact, Apache is routinely used in production as the front-end web server, instead of the servlet engine's built-in web server, simply because it is faster at serving static content and the AJP overhead is negligible. If the "stub" referred to in the quote is not the AJP module, then this may not be relevant; nevertheless, AJP has always been highly efficient and typically negligible with regard to performance (the same typical connection min/max/idle count configurations apply as do to Apache itself).
The article goes on to proclaim the complexities of caching and data object persistence, which we have eliminated from our comparison. Let's move on to the real bottleneck: the database. The article says "PHP's connectivity to the database consists of either a thin layer on top of the C data access functions, or a database abstraction layer called PEAR::DB. There is nothing to suggest tha
Re:rebuttal (Score:5, Informative)
I'm not sure what you're on, but you can build however-many-tiers-you-like applications with PHP. In fact, PHP supports a number of technologies specifically designed to communicate with additional tiers, including CORBA, JavaBeans and SOAP.
Let's also discard any caching as again this is something that Java handily accommodates but is not natively (or at least easily) available in PHP due to lack of state.
PHP supports persistent state through shared memory blocks trivially. The implementation of data caching schemes that use this feature is not hard.
17 child threads attempt to connect, one will not be able to. If there are bugs in your scripts which do not allow the connections to shut down (such as infinite loops), a database with only 32 connections may be rapidly swamped
Why would you limit your database to serving fewer connections than you have limited your web server to?
PHP supports an option to kill runaway scripts and reclaim their resources after a time limit has elapsed, which handily prevents the infinite loop problems mentioned.
Ok, so now we have a bunch of "persistent" connections that hang around with the process. How long do they hang around?
Until the database closes them or the PHP server process is killed.
What if two threads in the same process want to use a connection?
The connection is locked from the moment a thread acquires it (using the *_pconnect function) until the script using it terminates.
In the worst case, persistent connections make your problem much much worse, because now you have many more connections open to your database.
What does an inactive open connection to the database cost? Not very much, in my experience.
Your arguments have a little merit, but please try to do your research before ranting about a system.
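For what it's worth, the connection-limit argument above comes down to simple arithmetic, sketched here in Python. It assumes each web worker holds one persistent connection for its whole lifetime; the database limit of 16 in the first case is an assumption chosen to match the quoted "17 child threads" example, and the second set of numbers is entirely hypothetical.

```python
def connections_needed(web_servers, workers_per_server, conns_per_worker=1):
    """Upper bound on open DB connections when every worker keeps its
    persistent connection(s) open for as long as it lives."""
    return web_servers * workers_per_server * conns_per_worker


def refused(needed, db_max_connections):
    """How many workers cannot get a connection at full load."""
    return max(0, needed - db_max_connections)


# The quoted case: 17 workers against an assumed database limit of 16.
quoted_case = refused(connections_needed(1, 17), 16)        # 1

# A hypothetical farm: 4 web servers x 150 workers, database limit 500.
hypothetical_farm = refused(connections_needed(4, 150), 500)  # 100
```

So the fix is configuration: set the database's connection limit at or above the total worker count, or cap the workers. Whether hundreds of mostly idle connections are cheap depends on the database, which is where the two of us disagree.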
Let's find out. (Score:1)
(http://www.manandgoat.com/)
Real world examples? (Score:4, Interesting)
I think a real-world example could settle this debate. Look at the story on the JBoss Nukes Project [onjava.com]. It explains the CPU utilization and speed of the PHP version, and how moving to a J2EE implementation decreased the wait times dramatically.
It's difficult to argue with facts.
I can summarize it all (Score:2, Insightful)
2. Java scales well.
3. Friendster couldn't develop a scalable J2EE application, so they switched to PHP.
4. What will Friendster switch to when they can't develop a scalable PHP application?
In other news... (Score:1)
Scalability is Not a Language Feature (Score:4, Interesting)
(http://inglorion.net/ | Last Journal: Thursday October 06 2005, @07:17AM)
Scalability depends on how you write your code. If your algorithms are good, your system will scale, and if they aren't, it will not. Any language that doesn't let you write good algorithms cannot be expected to be generally useful, but I think neither PHP nor Java fall in that category.
Finally, I think scalability is really not what's important, but rather performance. When developing tailor-made applications, I only care whether they require more or fewer resources for the number of requests they actually get, not for higher or lower loads. Of course, for libraries, operating systems, etc. the argument is different.
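To illustrate the point about algorithms, here is a small Python sketch (the function names are made up for the example) that counts the work done by two ways of checking a list for duplicates. Neither version depends on anything language-specific; the same contrast holds in PHP or Java.

```python
def has_duplicate_quadratic(items):
    """Compare every pair: about n*(n-1)/2 comparisons in the worst case."""
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons


def has_duplicate_linear(items):
    """Remember what was seen in a set: one step per item."""
    seen = set()
    steps = 0
    for item in items:
        steps += 1
        if item in seen:
            return True, steps
        seen.add(item)
    return False, steps


unique = list(range(1000))
_, slow_work = has_duplicate_quadratic(unique)  # 499500 comparisons
_, fast_work = has_duplicate_linear(unique)     # 1000 steps
```

Ten times the input means roughly a hundred times the work for the first version and ten times for the second, and no choice of language or platform changes that ratio.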
Scalability has little to do with language (Score:4, Informative)
- The skill of the developers implementing the system
- The foresight of the original plan/architecture design
- Understanding of where bottlenecks/growth problems will occur
Any project that doesn't plan the scalability in from day one will likely struggle to fix the problem when scalability does become an issue.
IMHO scalability is a design and architectural problem; the language used (within reason) makes no difference. It's the quality and structure of the design itself which will make or break the system.
PHP does not scale well, FACT (Score:1)
(http://www.ionpanel.org/)
It's not the language or the toolkit! (Score:2)
(http://bas.scheffers.net)
If you have a replicating database backend, use database connection pooling and stay away from sessions, you don't have the inter-JVM messaging problem.
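In case "connection pooling" is unfamiliar, here is a bare-bones sketch in Python. It is not any particular library's pool: `ConnectionPool`, its `connection` method and the `open_connection` factory are names invented for the example, and the "connections" are plain placeholder objects.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


opened = []


def open_connection():
    """Hypothetical factory; a real one would connect to the database."""
    conn = object()
    opened.append(conn)
    return conn


pool = ConnectionPool(open_connection, size=2)

# One hundred requests are served with only two connections ever opened.
for _ in range(100):
    with pool.connection() as conn:
        pass  # run queries with conn here
```

The database sees a small, fixed number of connections per application server no matter how many requests arrive, which is why the pool size becomes the knob you tune.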
When you do that, you can add as many database and web servers as y