Using PDFToText to convert PDFs to text

PDF to Text Conversion

This is geared towards Windows users. pdftotext is a program for converting PDF files to text. On Windows, you can get it as part of the Xpdf open source viewer:
http://www.foolabs.com/xpdf/download.html

From the README:

"What is Xpdf?
-------------

Using the HTML::Strip Perl extension

Stripping HTML/XML/SGML

Example demonstrating how to use the HTML::Strip Perl extension for stripping HTML markup from text.

HTML::Strip strips HTML-like markup from text in a very quick and brutal manner, so it may not remove all of the HTML, depending on the complexity of your markup. You can also use the extension
to strip XML or SGML from text.

Code

Parsing RSS with Perl

This is based on a script provided in the article 'Add RSS feeds to your Web site with Perl XML::RSS' at
http://articles.techrepublic.com.com/5100-6228_11-5487340.html

The original script assumed that the RSS news feed would be located on your own server. To get
around this limitation, use LWP to fetch the contents of the remote file, save it to a file on your server,
and then parse that file.

Using the fgetcsv function

Reading the data saved in CSV file format

fgetcsv -- Gets line from file pointer and parse for CSV fields

fgetcsv() example: read and print the entire contents of a CSV file.
The field delimiter is assumed to be a comma unless you specify another delimiter with the optional third parameter.

PHP Code
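A minimal sketch of reading a CSV file with fgetcsv(), assuming the file is small enough to hold in memory. The function name readCsvRows and its parameters are hypothetical and chosen only for illustration.

```php
<?php
// Read every row of a CSV file and return the rows as arrays of fields.
// readCsvRows, $path and the default comma delimiter are illustrative assumptions.
function readCsvRows(string $path, string $delimiter = ','): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // Second argument 0 means no line-length limit; the third is the field delimiter.
    while (($fields = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv() reports a blank line as an array holding a single null
        }
        $rows[] = $fields;
    }

    fclose($handle);
    return $rows;
}
```

To print the contents, loop over the returned rows and echo implode(' | ', $row) for each one. Passing "\t" or ";" as the second argument to readCsvRows switches the delimiter.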

The Image Creation Capabilities of PHP

PHP image functions will allow you to create images dynamically

"The format of images you are able to manipulate depend on the version of GD you install, and any other libraries GD might need to access those image formats. Versions of GD older than gd-1.6 support gif format images, and do not support png, where versions greater than gd-1.6 support png, not gif." -- From the PHP Manual: Image Functions

PHP Sessions

A session allows you to preserve data across page requests (going from one page to another on a website) by saving the data to variables that you register as being part of a session. Sessions are tracked through cookies or through the URL. I prefer to track sessions through the URL because cookies may not be enabled on the client computer.

If PHP is compiled with --enable-trans-sid, relative URLs will carry the session ID automatically, by default in a parameter named PHPSESSID.
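Where the automatic rewriting is not available, the session ID can be added to links by hand. The following is a sketch only: urlWithSessionId is a hypothetical helper, and it assumes the URL has no #fragment. Recent PHP versions default to cookie-only sessions (session.use_only_cookies), so URL-based IDs are accepted only when that setting is turned off.

```php
<?php
// Append the session name and ID to a URL as a query parameter.
// urlWithSessionId is a hypothetical helper, not a PHP built-in.
function urlWithSessionId(string $url, string $sessionName, string $sessionId): string
{
    // Use "&" when the URL already has a query string, "?" otherwise.
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . urlencode($sessionName) . '=' . urlencode($sessionId);
}

// Typical use inside a page (left as comments so the sketch has no side effects):
// session_start();
// $_SESSION['visits'] = ($_SESSION['visits'] ?? 0) + 1;
// $next = urlWithSessionId('page2.php', session_name(), session_id());
// echo '<a href="' . htmlspecialchars($next) . '">Next page</a>';
```

session_name() and session_id() are the built-in functions that return the current parameter name and ID, so the helper does not hard-code PHPSESSID.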

Uploading Files with PHP

You can use a browser to upload files with PHP. The files can be either text or binary files. PHP has several file manipulation functions that allow you to control what you do with a file once it has been uploaded, how large the uploaded file may be, where it should go after being uploaded, and so on.

To upload files, create a file upload form:

Example:
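A minimal sketch of an upload form and a check on the uploaded file, under stated assumptions: the script name upload.php, the field name userfile, the helper checkUpload, and the size limit are all hypothetical.

```php
<?php
// Upload form markup. The multipart/form-data enctype is required for file uploads.
// "upload.php" and the field name "userfile" are hypothetical.
const UPLOAD_FORM = <<<HTML
<form action="upload.php" method="post" enctype="multipart/form-data">
  <input type="file" name="userfile">
  <input type="submit" value="Upload">
</form>
HTML;

// Validate one entry of $_FILES.
// Returns an error message, or null when the upload looks acceptable.
function checkUpload(array $file, int $maxBytes): ?string
{
    if (!isset($file['error']) || $file['error'] !== UPLOAD_ERR_OK) {
        return 'Upload failed';
    }
    if ($file['size'] > $maxBytes) {
        return 'File is too large';
    }
    return null;
}

// In upload.php, once checkUpload($_FILES['userfile'], $limit) returns null:
// move_uploaded_file(
//     $_FILES['userfile']['tmp_name'],
//     '/path/to/uploads/' . basename($_FILES['userfile']['name'])
// );
```

move_uploaded_file() is the built-in that moves the temporary file to its final location; the destination directory shown in the comment is a placeholder.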

Flash Guestbook V2

Creating a guestbook in Flash using PHP script and a MySQL database

Setting up the database

You will need access to a server running PHP and MySQL. If you use another database, such as PostgreSQL or MSSQL, replace the MySQL functions that I use in the PHP script with the equivalent functions for that database.

First create two tables in the database:


# This table will hold the guestbook entries.
# Table structure for table `flashgb`
#

Fetching and saving the contents of a remote file

Submit a query to a search engine and save the results to a local file

1. Create a form that submits a query to a search engine.

2. The script should automatically save the query results to a file, with each query saved to a separate file.
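The second step can be sketched as follows. This is an illustration under assumptions, not the original script: the search URL is a placeholder, the function names are hypothetical, and fetching over HTTP with file_get_contents() requires allow_url_fopen to be enabled.

```php
<?php
// Turn a query into a safe, distinct file name (hypothetical naming scheme).
function queryToFilename(string $query): string
{
    $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($query)), '-');
    // A short hash keeps queries that reduce to the same slug in separate files.
    return ($slug === '' ? 'query' : $slug) . '-' . substr(md5($query), 0, 8) . '.html';
}

// Save the result page for one query and return the path that was written.
function saveResults(string $dir, string $query, string $content): string
{
    $path = rtrim($dir, '/') . '/' . queryToFilename($query);
    if (file_put_contents($path, $content) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return $path;
}

// Fetch the result page from a placeholder search engine URL.
function fetchResults(string $query): string
{
    $url = 'https://search.example.com/?q=' . urlencode($query);
    $content = file_get_contents($url);
    if ($content === false) {
        throw new RuntimeException("Cannot fetch $url");
    }
    return $content;
}

// Typical use from the form handler (the "q" field name is an assumption):
// $query = $_GET['q'] ?? '';
// saveResults(__DIR__ . '/results', $query, fetchResults($query));
```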

Working with regular expressions: HREF URL Extractor

HREF URL Extractor

Working with regular expressions

To extract the value (URL) of an HREF attribute from a string, use the preg_match or preg_match_all function. This example uses preg_match_all to find all the matches.

preg_match_all - Perform a global regular expression match

preg_match_all (pattern (string), target (string), matches (array), optional flags)

Searches target for all matches to the regular expression given in pattern and puts them in matches in the order specified by the flags.
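A minimal sketch of an HREF extractor built on preg_match_all. The function name extractHrefs and the pattern are illustrative assumptions; the pattern handles only quoted attribute values inside anchor tags and is not a full HTML parser.

```php
<?php
// Return every href value found in the anchor tags of $html.
// extractHrefs and the pattern below are illustrative, not a complete HTML parser.
function extractHrefs(string $html): array
{
    // Group 1 captures the opening quote, group 2 the URL up to the matching quote.
    // The "i" flag ignores case (HREF, href); "s" lets "." match newlines.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is';

    // PREG_PATTERN_ORDER (the default) groups results by capture group,
    // so $matches[2] holds all the URLs.
    preg_match_all($pattern, $html, $matches, PREG_PATTERN_ORDER);
    return $matches[2];
}
```

With PREG_SET_ORDER instead, each element of the matches array would hold one complete match with its groups, which is convenient when the link text is captured as well.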
