HTTP Client Library for PHP

Update (Dec 2013): If you’re just looking for a good HTTP library in PHP, I currently recommend Requests.

As I mentioned in my last post, I’m currently spending a lot of time thinking about and coding PHP libraries for the various Open Stack protocols. I’ve recently hit a common roadblock with a couple of the libraries, and wanted to solicit some feedback from the community. Ironically enough, I’m stuck with regard to the absolutely lowest level in the technology stack that most developers have to deal with: HTTP. We need good HTTP support in the Open Stack libraries to do basic things like fetching metadata documents, and sending OAuth and OpenID requests. Some things to consider:

  • we don’t need to worry about PHP 4
  • we need to be able to attach custom headers in the request
  • we need to be able to view the headers from the response
  • SSL support is a must
  • gzip compression is highly desirable as well
  • we need to avoid external dependencies beyond a standard PHP 5 installation, or at least keep them to as few as possible. (this of course doesn’t include any code we ship as part of the library itself)
  • I’m not aware of any need for support of HTTP cookies
  • it might be nice if particular platforms could provide their own HTTP handler. eh, maybe?
  • if we’re going to redistribute code with the libraries, the license is very important

Now PHP has myriad ways to make an HTTP request, which turns out to be the problem. Some of them are more robust than others, some are PHP extensions that are not bundled with the main distribution, some are bundled with PHP but rely on external libraries being linked into PHP when it’s built, and some rely on specific PHP configurations. Let’s review some of the most popular ones (and do leave a comment if I missed any):

  • sockets – this is the lowest level way of doing HTTP and uses basic Internet sockets. As such, it provides the most flexibility, but also the largest burden in having to deal with all of the low-level intricacies of HTTP. It has no external dependencies (aside from OpenSSL of course for SSL/TLS support).

  • file resource – uses fopen and related functions. Requires that the PHP configuration option allow_url_fopen be enabled. I don’t believe this supports custom request headers, but would love to be corrected on that. If it doesn’t, this is an obvious deal breaker.

  • streams – this is effectively a wrapper around fopen but adds the concept of a request context which can be used to include custom headers and such. Because this uses fopen under the covers, it also requires that allow_url_fopen be enabled.

  • HTTP extension – this is a PECL extension that does exactly what we need. Unfortunately, it’s not included in the standard PHP distribution, so we certainly can’t rely on it being present.

  • curl – one of the more popular methods, this uses the external library libcurl. This is a very robust and flexible URL library that can handle everything we want to do and more. It is relatively common on most PHP deployments, but because it relies on an external library, its availability is not guaranteed.
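To give a feel for the "low-level intricacies" that the sockets option involves, here is a rough sketch of building a request by hand and splitting a raw response into status, headers, and body. The function names (build_raw_request, parse_raw_response, socket_get) are made up for illustration and do not come from any of the libraries discussed in this post. The sketch deliberately ignores chunked transfer encoding, gzip, redirects, and repeated headers, all of which a real client would have to handle.

```php
<?php
// Hypothetical helpers for illustration; not taken from any existing library.

// Build a minimal HTTP/1.0 GET request with custom headers.
// HTTP/1.0 is used so the server should not reply with chunked encoding.
function build_raw_request($host, $path, array $headers = array())
{
    $lines = array("GET $path HTTP/1.0", "Host: $host");
    foreach ($headers as $name => $value) {
        $lines[] = "$name: $value";
    }
    // A blank line terminates the header block.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

// Split a raw response into status code, headers, and body.
// Header names are lower-cased; a repeated header overwrites the earlier one.
function parse_raw_response($raw)
{
    $parts = explode("\r\n\r\n", $raw, 2);
    $body = isset($parts[1]) ? $parts[1] : '';
    $lines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($lines);

    $status = 0;
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        $status = (int) $m[1];
    }

    $headers = array();
    foreach ($lines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return array('status' => $status, 'headers' => $headers, 'body' => $body);
}

// Send the request over a socket. HTTPS uses the ssl:// transport,
// which is only there when PHP was built with OpenSSL.
function socket_get($host, $path, array $headers = array(), $useSsl = false, $timeout = 10)
{
    $target = ($useSsl ? 'ssl://' : '') . $host;
    $fp = fsockopen($target, $useSsl ? 443 : 80, $errno, $errstr, $timeout);
    if ($fp === false) {
        return false;
    }
    fwrite($fp, build_raw_request($host, $path, $headers));
    $raw = stream_get_contents($fp); // read until the server closes the connection
    fclose($fp);
    return parse_raw_response($raw);
}
```

Something like socket_get('example.com', '/', array('Accept' => 'text/html')) would then return the parsed array, or false if the connection could not be opened.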

So there is no single solution that we can count on. What’s that you say? Provide an interface and then have multiple implementations that utilize several of the above methods? That’s a brilliant idea, but which one do we use?
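The "interface plus multiple implementations" idea could be sketched roughly as follows. Every name here (HttpTransport, CurlTransport, pick_transport) is hypothetical and is not the API of any library listed below. Only a curl-backed transport is shown, on the assumption that a sockets or streams transport would implement the same two methods.

```php
<?php
// All names here (HttpTransport, CurlTransport, pick_transport) are
// hypothetical and chosen for illustration only.

interface HttpTransport
{
    // True if this transport can work in the current PHP environment.
    public function isAvailable();

    // Returns array('status' => int, 'headers' => array of raw header lines,
    // 'body' => string), or false if the request could not be made.
    public function request($method, $url, array $headers = array(), $body = null);
}

class CurlTransport implements HttpTransport
{
    public function isAvailable()
    {
        return function_exists('curl_init');
    }

    public function request($method, $url, array $headers = array(), $body = null)
    {
        // curl wants request headers as "Name: value" strings.
        $lines = array();
        foreach ($headers as $name => $value) {
            $lines[] = "$name: $value";
        }

        $seen = array();
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $lines);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Collect response header lines as curl reads them.
        curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$seen) {
            if (trim($line) !== '') {
                $seen[] = trim($line);
            }
            return strlen($line); // curl expects the byte count back
        });
        if ($body !== null) {
            curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
        }

        $responseBody = curl_exec($ch);
        if ($responseBody === false) {
            return false;
        }
        $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        return array('status' => $status, 'headers' => $seen, 'body' => $responseBody);
    }
}

// Return the first transport that reports itself usable, or null if none is.
function pick_transport(array $candidates)
{
    foreach ($candidates as $transport) {
        if ($transport->isAvailable()) {
            return $transport;
        }
    }
    return null;
}
```

Calling code would then depend only on the interface, for example $transport = pick_transport(array(new CurlTransport())), so the implementation behind it could be swapped out later without touching the callers.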

  • php-openid – the current JanRain library has its own HTTPFetcher interface with support for libcurl and sockets. It’s written pretty specifically for the OpenID library, so would need some cleanup to be used in any other setting.

  • HttpClient – in my searching, I found this HTTP client library from Simon Willison. Don’t know much about the library, but I know Simon’s reputation and have seen his excellent XML-RPC library.

  • Snoopy – this seems to be a pretty popular HTTP client library. It has support for sockets and curl. No, not libcurl, but curl the executable. Apparently, the developers weren’t comfortable with the stability of PHP’s curl functions, so they make a system call out to the curl executable.

  • WP_Http – WordPress has a pretty robust HTTP client library they wrote to replace Snoopy. It supports all of the above mechanisms (that’s actually where I got the list). That of course means that it’s pretty hefty, and probably provides more functionality than we really need. It is also pretty tightly integrated with WordPress, and would require a bit of work to re-use it in any other context.

  • HTTP_Request – PEAR package that provides seemingly pretty robust support. I’m not sure what methods it supports, but I’d guess sockets and libcurl. The downside of course is that it comes with all the extra weight that a PEAR package includes.

  • HTTP_Request2 – the successor to HTTP_Request above, written with PHP5 style objects. Looks like about the same level of support for HTTP, but less tested. And of course it is also a PEAR package.

  • NIH – we could, of course, always roll our own.

I’m sure there are others here that I’ve missed. If you have a serious suggestion that might work for what we need, please do include it in the comments below. If you have another to throw on the pile that isn’t really realistic for the Open Stack libraries, don’t bother… I know there are a bunch out there.

So honestly, I’m at a loss. As with the Open Stack libraries themselves, the most important part is the interface. We can always switch out the implementation later, but I really don’t want to be rewriting code on a regular basis to match the HTTP interface du jour. I don’t have terribly strong feelings for or against any of the above, I just know we need something. So here it is… what do you all prefer in terms of an HTTP client library for PHP?

Comments and responses

Having worked with it myself, I quite like the WP HTTP library, and I think it covers everything you need (and, as you said, a bit more). It’s going to be kept up to date because it’s a core part of WordPress, so that’s a benefit, and it supports chunked transfers, gzip etc.

Oh, and as far as re-purposing it for use outside of WP, I already thought about this a little bit before. The main WP-specific stuff that it uses which you’d need to work around are the filters/actions that WP uses. I think you could handle this by basically just including a small wrapper that defined those functions, and made them do basically nothing (except return the value they were handed). This isn’t particularly slick or optimized, but it’d avoid needing to rewrite/modify the code every time it was updated at WP.

Here’s a list of WP functions used from a quick look:

  • has_filter()
  • apply_filters()
  • wp_parse_args()
  • get_option()
  • has_action()
  • do_action()

They could easily be “stubbed” to avoid unknown function calls. There are probably also some constants that would need to be defined to avoid warnings etc.
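A minimal sketch of what such stubs might look like is below. These are simplified guesses at the behaviour the HTTP code would rely on, not copies of the WordPress functions, and the parameter lists are assumptions. Each one is guarded with function_exists() so that it stays out of the way when the real WordPress function is loaded.

```php
<?php
// Do-nothing stand-ins for the WordPress functions listed above.
// Simplified assumptions about their behaviour, not WordPress's own code.

if (!function_exists('apply_filters')) {
    // Hand back the value unchanged; any extra arguments are ignored.
    function apply_filters($tag, $value)
    {
        return $value;
    }
}

if (!function_exists('has_filter')) {
    // No filters are ever registered outside WordPress.
    function has_filter($tag, $callback = false)
    {
        return false;
    }
}

if (!function_exists('has_action')) {
    function has_action($tag, $callback = false)
    {
        return false;
    }
}

if (!function_exists('do_action')) {
    // Actions are fired and forgotten.
    function do_action($tag)
    {
    }
}

if (!function_exists('get_option')) {
    // There is no options table, so always return the default.
    function get_option($name, $default = false)
    {
        return $default;
    }
}

if (!function_exists('wp_parse_args')) {
    // Merge caller-supplied arguments (array, object, or query string)
    // over a set of defaults.
    function wp_parse_args($args, $defaults = array())
    {
        if (is_object($args)) {
            $args = get_object_vars($args);
        } elseif (!is_array($args)) {
            parse_str((string) $args, $parsed);
            $args = $parsed;
        }
        return array_merge($defaults, $args);
    }
}
```

With these in place, a call such as apply_filters('some_tag', $value) simply returns $value, which is the "do basically nothing" behaviour described above.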

Just a thought.

Yeah, I really like the idea of using WP_Http because I do know it will be maintained and I really like the interface… very clean and simple. I don’t want to stub out the WP functions, given that this is going to be a core part of the Open Stack PHP libraries. If anything, I’d maintain a simplified version of it… don’t really need cookie support or more than two or so of the available transport methods (curl and sockets, probably). It means I’d have to stay on top of changes, but I’m generally okay with that, especially given that they should only be bug fixes… HTTP isn’t changing any time soon.

Yep, fair enough. And as you say, the only changes will be feature additions (e.g. better support for SSL and some other things that are a little bit fluid still) and then bug fixes, so probably not that big of a deal to keep up to date with it.

I haven’t looked at it in a while, but this might be interesting:

http://code.blitzaffe.com/pages/phpclasses/category/52/fileid/7

It implements the curl API in pure PHP if the curl PHP extension isn’t available.

@Joseph: that’s actually a really interesting idea, and looks like a decent implementation of it. I think a trimmed down version of WP_Http would be a bit smaller though, and I like that it provides a clean interface to code against instead of just the curl functions. But that’s a really cool library!

Native cURL in PHP is very common, but, as you say, not guaranteed.

For simple requests, the most common thing to do is try cURL if it’s around, and if it’s not, fall back to sockets/fopen. If there’s no cURL and allow_url_fopen is off, things become much harder. Thankfully this is extremely rare.
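A rough sketch of that kind of check might look like this. The helper names and the preference order (cURL, then URL-aware fopen/streams, then raw sockets) are assumptions made for illustration; the individual checks use standard PHP functions.

```php
<?php
// Hypothetical helpers; the order of preference is an assumption.

// List the usable mechanisms, most preferred first.
function available_http_transports()
{
    $found = array();
    if (function_exists('curl_init')) {
        $found[] = 'curl';
    }
    // ini_get() may return "1", "On", "", "0", ... so normalise it to a bool.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $found[] = 'streams';
    }
    if (function_exists('fsockopen')) {
        $found[] = 'sockets';
    }
    return $found;
}

// HTTPS needs more than a transport: libcurl must have been built with
// HTTPS support, and the stream and socket options rely on OpenSSL.
function https_supported($transport)
{
    if ($transport === 'curl') {
        if (!function_exists('curl_version')) {
            return false;
        }
        $info = curl_version();
        return in_array('https', $info['protocols'], true);
    }
    return extension_loaded('openssl');
}
```

The first entry of available_http_transports() is the mechanism to try; an empty array is the "much harder" case described above.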

You might have forgotten the Zend Framework HTTP component, or chosen not to include it because it is larger than the HTTP API in WordPress.

I’ve thought about stripping the WordPress dependencies from the HTTP API myself and releasing it as a stand alone library. Never got around to it. And it would still be GPLv2.

What you need to concern yourself with, really, is the license. If you use the HTTP API from WordPress, then anything you create using it will have to be compatible with GPLv2, and the library will only be usable in software that is GPLv2 or compatible. Also, any software using it when running will become GPLv2.

If you are not okay with that, then you should look at the other libraries or do the NIH approach.

You will find when developing your own library that the PHP documentation isn’t entirely correct, and that it omits bugs in PHP that you need to work around in your code.

@Jacob: yes, I have certainly been thinking about the license issue. The other libraries I mentioned in my last post are either MIT or Apache, so this would be a departure. Personally, I wouldn’t be terribly affected by the license, since all the DiSo work is open source anyway. The bigger concern is users of the libraries… I’ll have to ping some of them and see if GPLv2 would be a problem for them. My gut tells me to stay away from it and leave no room for doubt, though.

Hmm… not sure what to tell you. I’ve had success in the past with several of them. Perhaps make sure that you’re using a certificate from a trusted authority? (or disable verification)

Will, just a small aside, I would try to make sure you had cookie support. It’s been my experience that in almost all the cases where you don’t think you need it, you run into some edge case where you actually do need it.

Also, could you ask the WP people if they could more cleanly separate their HTTP library from their main code? Sounds like it could be generally useful.

For what it’s worth you can have custom headers with the file resources option. You can look at file_get_contents and the custom context options (Yeah, PUTs and DELETEs are possible). One annoying thing I found though, is that when you receive a 400 range response, the return is just false, and you can’t retrieve any of the information from the response or headers.
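As a sketch of that approach: the helper names below are made up for illustration, while the 'http' context options and the $http_response_header variable are standard PHP. The 'ignore_errors' option (documented as added in PHP 5.2.10) asks PHP to return the body even for failure status codes, which may address the false-on-4xx annoyance where that option is available.

```php
<?php
// Hypothetical helper names; the context options themselves are standard PHP.

// Build 'http' context options for a request with custom headers.
function build_http_context_options($method, array $headers = array(), $body = null)
{
    $lines = array();
    foreach ($headers as $name => $value) {
        $lines[] = "$name: $value";
    }
    $http = array(
        'method'        => $method, // GET, POST, PUT, DELETE, ...
        'header'        => implode("\r\n", $lines),
        'ignore_errors' => true,    // return the body even on 4xx/5xx
    );
    if ($body !== null) {
        $http['content'] = $body;
    }
    return array('http' => $http);
}

// Find the status code in header lines such as $http_response_header.
// After redirects there are several status lines; the last one wins.
function status_from_header_lines(array $lines)
{
    $status = 0;
    foreach ($lines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Fetch a URL with file_get_contents(); needs allow_url_fopen to be enabled.
function stream_fetch($url, $method = 'GET', array $headers = array(), $body = null)
{
    $options = build_http_context_options($method, $headers, $body);
    $responseBody = file_get_contents($url, false, stream_context_create($options));
    // PHP fills $http_response_header in the local scope after an HTTP request.
    $lines = isset($http_response_header) ? $http_response_header : array();
    if ($responseBody === false) {
        return false;
    }
    return array(
        'status'  => status_from_header_lines($lines),
        'headers' => $lines,
        'body'    => $responseBody,
    );
}
```

A call like stream_fetch($url, 'PUT', array('Content-Type' => 'text/plain'), 'data') would then return the status, raw header lines, and body together, even when the server answers with a 404.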

You might also want to look at Drupal’s HTTP client at http://api.drupal.org/api/function/drupal_http_request/7. It’s pretty self-contained. It references other Drupal-specific functions, but you could easily remove them. License is GPLv2.