Saturday, March 22, 2014

CrashPlan and JunOS Pulse conflict

I spent an hour or so today trying to get CrashPlan working on my Linux machine.  The software installed just fine, and the backup engine appeared to start fine as well:

$ sudo service crashplan start    
Starting CrashPlan Engine ... Using standard startup
OK

However, the desktop portion of the CrashPlan software kept saying that it was "Unable to connect to the backup engine, retry?".  Weird.  So I checked to see if the server was actually listening on the port it's supposed to be listening on:

$ netstat -ln | grep 4243
$

Nope.  So it must not be starting up as "OK" as it claims.  Digging through some of the logs (specifically /usr/local/crashplan/log/service.log.0) I found this gem that shows up just before the service automatically shuts down:

[03.22.14 22:05:04.511 WARN    main                 com.backup42.service.CPService          ] >>>>> CPService is already listening on 0.0.0.0:4242 <<<<<

Wait, wat?  Something (it thinks it's itself, but that's not the case) is already bound to port 4242.  Netstat shows who actually is bound to that port:

$ sudo netstat -anp | grep 4242 
tcp  0 0 127.0.0.1:4242  0.0.0.0:*        LISTEN      11466/ncsvc
tcp  0 0 127.0.0.1:45982 127.0.0.1:4242   ESTABLISHED 11466/ncsvc
tcp  0 0 127.0.0.1:4242  127.0.0.1:45982  ESTABLISHED 11466/ncsvc

ncsvc!  That's the Juniper VPN software (aka JunOS Pulse).  Well that's no good, I'm almost always connected to the VPN.  Luckily, CrashPlan lets you configure the port that the server uses to something other than 4242!
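
If you'd rather check for this kind of conflict from code than from netstat, here is a minimal sketch using plain POSIX sockets.  It is not anything CrashPlan or the VPN client does internally.  The function names (openListener, listenerPort, portIsFree) are made up for illustration, and it assumes a Linux/POSIX system with IPv4.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Open a TCP listener on ip:port. Returns the socket descriptor, or -1 if
// the address could not be bound (for example because it is already in use).
// Hypothetical helper, named for this example only.
int openListener(const char* ip, std::uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        listen(fd, 1) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Report the port a listener ended up on (useful when asking for port 0,
// which lets the kernel pick a free one). Returns 0 on failure.
std::uint16_t listenerPort(int fd) {
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) return 0;
    return ntohs(addr.sin_port);
}

// True if we were able to bind and listen on ip:port right now.
bool portIsFree(const char* ip, std::uint16_t port) {
    int fd = openListener(ip, port);
    if (fd < 0) return false;
    close(fd);
    return true;
}
```

Calling portIsFree("127.0.0.1", 4242) while the VPN client is holding that port should come back false, which is the same answer netstat gave me above.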

The fix that made it work happily ever after was simply modifying a line in /usr/local/crashplan/conf/my.service.xml:

Change:
 <location>0.0.0.0:4242</location>
To a port that isn't already in use:
 <location>0.0.0.0:4244</location>

Now the CrashPlanEngine portion of the software runs smoothly even when I'm on the VPN!


Sunday, March 17, 2013

Apache mod_vhost_alias VirtualDocumentRoot per second level domain name

I wanted to have a virtual document root for each second level domain name (example.com, f85.net, abc.net, etc...), with subdomain virtual document roots located underneath the main domain name.

For example:

         f85.net --> /var/www/f85.net/f85.net
     www.f85.net --> /var/www/f85.net/www.f85.net 
www.news.f85.net --> /var/www/f85.net/www.news.f85.net 
 www.example.com --> /var/www/example.com/www.example.com 
 abc.sub.pri.com --> /var/www/pri.com/abc.sub.pri.com

And so on, so that each primary domain (that gets registered with a registrar) has its own folder in /var/www.

The problem was that everything I tried failed to give the desired results, because I didn't fully understand how the 'Directory Name Interpolation' works (how Apache turns the configured VirtualDocumentRoot string into a string that represents a discrete folder each time a file is accessed).

What ended up working was:

VirtualDocumentRoot /var/www/%-2.0.%-1/%0
VirtualScriptAlias  /var/www/%-2.0.%-1/%0/cgi-bin/

I'll break this down in pieces to explain how it's working:

/var/www/%-2.0.%-1/%0
The first piece, "/var/www/", is the folder that holds all of the domains.  It never changes.


/var/www/%-2.0.%-1/%0
The next piece, "%-2.0", references the second value from the right of the fully qualified domain name (FQDN).  For example, in the URL "http://www.f85.net", this will give us "f85".  When you use the - symbol, Apache starts counting from the right (so "%-1" would yield "net", and "%-3" would yield "www").  The ".0" following "%-2" is also important.  Although it doesn't technically do anything ("%-2" == "%-2.0"), it allows us to add the next item without errors.


/var/www/%-2.0.%-1/%0

The next piece is the "." that sits between "%-2.0" and "%-1".  It is a dot, plain and simple (up to and including this piece, we would have "/var/www/f85.").  The reason I wanted to dedicate a paragraph to it is because it gave me the most trouble.  The first thing I tried was combining "%-2" with a period ("%-2.").  Logically this would give us "f85.", right?  Nope!  When Apache sees this period following a "%-2" (or any other "%x"), it expects another number after the dot, as described in the Directory Name Interpolation section of mod_vhost_alias.  This second number is used to select a single character out of the first number's text (for the "www.f85.net" example, "%-2.1" would represent the character "f").  We can use the number 0 here ("%-2.0") to tell Apache to select the entire value and not some specific character, which allows Apache to finish parsing this "%x" phrase so that we can add our "." and have it actually represent a period without breaking anything!



/var/www/%-2.0.%-1/%0
The next piece, "%-1", is the first value from the right of the FQDN ("net", in our "www.f85.net" example).  Pretty straightforward.  Up to and including this piece, we now have "/var/www/f85.net".  So far so good, just one last piece.

/var/www/%-2.0.%-1/%0
The last piece, "/%0", represents the entire FQDN (with a forward slash in front of it).  Up to and including this piece, we're done!  We have "/var/www/f85.net/www.f85.net".
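
To sanity-check a format string without restarting Apache each time, the rules walked through above can be modeled in a few lines of C++.  This is only a rough sketch of my understanding of the interpolation, not Apache's actual code.  It handles just %%, %0, %N, %-N and an optional .M with M >= 0.  It leaves out the "+" forms and negative M.  It assumes that a part that doesn't exist becomes "_".  The function names (splitHost, readDigit, interpolate) are made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Split "www.f85.net" into {"www", "f85", "net"}.
std::vector<std::string> splitHost(const std::string& host) {
    std::vector<std::string> parts;
    std::string current;
    for (std::size_t i = 0; i < host.size(); ++i) {
        if (host[i] == '.') {
            parts.push_back(current);
            current.clear();
        } else {
            current += host[i];
        }
    }
    parts.push_back(current);
    return parts;
}

// Read the single digit at pattern[i] and advance i; anything else is an error.
int readDigit(const std::string& pattern, std::size_t& i) {
    if (i >= pattern.size() || !std::isdigit(static_cast<unsigned char>(pattern[i]))) {
        throw std::invalid_argument("expected a digit in format string");
    }
    return pattern[i++] - '0';
}

// Simplified model of the interpolation (an assumption-laden sketch):
// supports %%, %0, %N, %-N and an optional .M (0 = whole part, M = Mth character).
std::string interpolate(const std::string& pattern, const std::string& host) {
    const std::vector<std::string> parts = splitHost(host);
    const int count = static_cast<int>(parts.size());
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] != '%') { out += pattern[i++]; continue; }
        ++i;  // skip the '%'
        if (i < pattern.size() && pattern[i] == '%') { out += '%'; ++i; continue; }

        bool fromRight = false;
        if (i < pattern.size() && pattern[i] == '-') { fromRight = true; ++i; }
        const int n = readDigit(pattern, i);

        std::string piece;
        if (n == 0) piece = host;                  // %0 is the whole name
        else if (n > count) piece = "_";           // assumed placeholder for a missing part
        else piece = fromRight ? parts[count - n] : parts[n - 1];

        // A dot directly after %N must be followed by a digit M.
        // This is the rule that makes "%-2." on its own an error.
        if (i < pattern.size() && pattern[i] == '.') {
            ++i;
            const int m = readDigit(pattern, i);
            if (m > 0) {
                piece = (m <= static_cast<int>(piece.size()))
                            ? std::string(1, piece[m - 1]) : std::string("_");
            }
        }
        out += piece;
    }
    return out;
}
```

With this model, interpolate("/var/www/%-2.0.%-1/%0", "www.f85.net") returns "/var/www/f85.net/www.f85.net", while the format "/var/www/%-2.%-1/%0" throws, because the dot after "%-2" is not followed by a digit.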

Now, this is definitely not the first way I tried to do this, but it's the only way that I've found to work consistently across sub-domains of various depths.  Please let me know in the comments if you see a problem with this method or have a better idea!!

Thursday, February 2, 2012

Xerces C++ Tutorial

I was never able to find a good xerces-c tutorial that walked me through parsing and reading a basic XML document.  I finally figured it out after hours of frustration, so I thought I'd share my results in a moderately complex tutorial that can be easily duplicated.  In this tutorial I will use xerces to iterate through an XML document, storing the XML data in a C++ STL map. I used the CodeLite 3.5 IDE on Ubuntu Linux 11.10 32-bit to develop and run this project.

This tutorial will assume you were able to get xerces installed and linked properly.  If you get stuck here, let me know and maybe I'll write a tutorial for that, but it would only be for the CodeLite 3.5 IDE.

To start with, here is the XML file I used: testdoc.xml
And here is the XML schema for that doc (important): schema.xsd
And finally here is the .cpp file that runs it all: main.cpp

Put these in the same folder as your C++ project.  I put all three files in the root of my CodeLite workspace.

I'm going to walk you through that cpp file a few lines at a time, beginning here:

#include <vector>
#include <string>
#include <map>
#include <iostream>
#include <xercesc/parsers/XercesDOMParser.hpp>
#include <xercesc/dom/DOM.hpp>
#include <xercesc/sax/HandlerBase.hpp>

using namespace std;
using namespace xercesc;

That part was pretty simple.  If you get any errors in this section, you'll need to re-check your link to xerces.

Now, let's move on to setting up our data container (an STL map) and initializing xerces!

// This is where our data will go after it's pulled out of the XML file
map<string,pair<int,int> > myData;

// Initialize xerces
try { XMLPlatformUtils::Initialize(); }
catch (const XMLException& toCatch) {
    char* message = XMLString::transcode(toCatch.getMessage());
    cout << "Error during initialization! :\n"
         << message << "\n";
    XMLString::release(&message);
    return 1;
}
This portion of the code is the basic initialization call for xerces, and that's really all it does - we're still not doing anything interesting yet.  The one thing that is important here is our STL map declaration, which is the first actual code in the main function.  As you can see from the XML file we'll be using, there are 3 values for each data item (each data item is called a "word" in the XML): one string (the word text) and two integers.  Therefore, I decided to use a map, which is a fast way to store pairs of information.  The left item will be a string, the word value.  The right item will be another container - an STL "pair".  This pair will contain the two integers.
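
If the map-of-pairs idea is new to you, here is a small standalone sketch of how that container behaves, with no xerces involved.  The typedef and the two helper functions (WordMap, storeWord, lookupWord) are names I made up for this example.  They are not in main.cpp.

```cpp
#include <map>
#include <string>
#include <utility>

// Same shape as myData in the tutorial: word text -> (type, value).
typedef std::map<std::string, std::pair<int, int> > WordMap;

// Store one word. operator[] creates the entry if it is missing and
// overwrites it if the same word text was stored before.
void storeWord(WordMap& data, const std::string& text, int type, int value) {
    data[text] = std::make_pair(type, value);
}

// Look a word up without inserting it. Returns false if it is not present.
bool lookupWord(const WordMap& data, const std::string& text,
                std::pair<int, int>& out) {
    WordMap::const_iterator it = data.find(text);
    if (it == data.end()) return false;
    out = it->second;
    return true;
}
```

One thing worth knowing: because the word text is the key, two words with the same text in the XML would end up as a single entry, with the later one winning.  Use find() rather than operator[] for lookups, since operator[] quietly inserts a missing key.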


// Create parser, set parser values for validation
XercesDOMParser* parser = new XercesDOMParser();
parser->setValidationScheme(XercesDOMParser::Val_Always);
parser->setDoNamespaces(true);
parser->setDoSchema(true);
parser->setValidationConstraintFatal(true);

// You'll probably need to change the path below, or you'll get a segmentation fault.
// parse() accepts a plain char* path, so there is no need to transcode it
// (the transcoded buffer was never released, which leaked memory).
parser->parse("../testdoc.xml");

This chunk of code creates the document parser, and sets a few attributes, mostly to make sure we get valid XML.  This will fail and the program will crash if you don't have the correct schema set up in your .XSD file.  This will also fail and crash with a segmentation fault if it can't find your XML document - so make sure that path is correct!  If you can't seem to get it to work with a relative path, try an absolute path.


DOMElement* docRootNode;
DOMDocument* doc;
DOMNodeIterator * walker;
doc = parser->getDocument();
docRootNode = doc->getDocumentElement();

// Create the node iterator, that will walk through each element.
try { walker = doc->createNodeIterator(docRootNode,DOMNodeFilter::SHOW_ELEMENT,NULL,true); }

The code above (I excluded all of the catch statements - I'm not going to deal with error handling here) creates a number of pointers that will be used for the rest of the code.  It also gets the parsed document and loads it into "doc", which then is used to load the root node (called simply "root" in our XML file) into "docRootNode".


// Some declarations
DOMNode * current_node = NULL;
string thisNodeName;
string parentNodeName;
bool wordParts[3] = {false,false,false};
string wordText = "";
pair<int,int> wordTypeValue;

This code consists of a few more declarations, used to hold temporary information as we loop through all of the elements of the XML document.  "thisNodeName" will hold the name of the node we're currently reading.  "parentNodeName" will hold the name of the current node's parent.  "wordParts" is an array of 3 true/false values, one for each part of a <word>.  As we iterate through each of the 3 elements that make up a <word> (see the XML doc to understand this), we will load them into "wordText" (for the word itself) and "wordTypeValue" (a pair of integers, for the other two values).  As we come across each of these pieces, the matching boolean in "wordParts" will be set to true, and when all 3 are true we'll know we have loaded all the info for one <word> into the temporary variables (wordText and wordTypeValue).  When this happens, we can load all of that data into one map entry (in the map that was defined at the very beginning) that will represent one <word>.


for (current_node = walker->nextNode(); current_node != 0; current_node = walker->nextNode()) {
    // transcode() allocates a new char buffer: copy it into the string, then release it
    char* name = XMLString::transcode(current_node->getNodeName());
    thisNodeName = name;
    XMLString::release(&name);
    name = XMLString::transcode(current_node->getParentNode()->getNodeName());
    parentNodeName = name;
    XMLString::release(&name);

The first line above starts the loop that will continue until we reach the end of the XML document.  It goes through the nodes one by one, making the data available through the current_node pointer.

The remaining lines assign the correct values to thisNodeName and parentNodeName each time we visit a new node (see the last code section for what these variables represent).  XMLString::transcode hands back a newly allocated buffer, so each one is released with XMLString::release once it has been copied into the string.


if(parentNodeName == "word" ) {
    if(thisNodeName == "wordText") {
        wordParts[0] = true;
        char* text = XMLString::transcode(current_node->getFirstChild()->getNodeValue());
        wordText = text;            // copy into the string...
        XMLString::release(&text);  // ...then free the transcoded buffer
    } else if(thisNodeName == "wordType") {
        wordParts[1] = true;
        wordTypeValue.first = 
            XMLString::parseInt(current_node->getFirstChild()->getNodeValue());
    } else if(thisNodeName == "wordValue") {
        wordParts[2] = true;
        wordTypeValue.second = 
            XMLString::parseInt(current_node->getFirstChild()->getNodeValue());
    }

This is where the magic happens!  As the program begins to iterate through the nodes of the XML document, eventually it will get to a value that we want.  As you can see from our XML document, the values we're interested in will always be directly underneath a <word> tag.  I used this as the logic for when to take a closer look.  That first if statement above will be true for any of the elements directly underneath a <word> tag, such as <wordText>.

The next three if statements evaluate the name of the current node.  The first node we hit after the <word> tag will be <wordText>.  This will put us inside the first nested if statement above (if (thisNodeName == "wordText")).

So now we know that we're on the element <wordText> - we just need to know the value that it contains.  This is actually NOT in the current element!  The value inside the <wordText> tags is actually in the first child node of <wordText>, which is a text node.  We read it with getFirstChild()->getNodeValue() under the nested if, and assign it to our temporary holder wordText.  We also set the first of the three values in the boolean array to true, so that we know we've found the first item we need to store the <word> data.

On the next iteration, we'll be on <wordType>!  We repeat the same basic process as we did with <wordText>, but instead of putting its child's value into the temporary holder wordText, we'll put it into the first element of our integer pair, wordTypeValue.  Another iteration later we'll be on <wordValue>, and we'll put its child's value into the second element of wordTypeValue.

Now that we've iterated through all three of these elements, all three of the booleans will be true!


if(wordParts[2] && wordParts[1] && wordParts[0]) {
    myData[wordText] = wordTypeValue;
    wordParts[0] = false;
    wordParts[1] = false;
    wordParts[2] = false;
}

So now that all three booleans are true, the conditions for the if statement above will be true.  The first line inside the if statement loads the temporary values we pulled out of XML in the previous code segment into our map.  It also resets the booleans to false so that we are ready to start fresh with another word.


} else {
    // Not in a word
    wordParts[0] = false;
    wordParts[1] = false;
    wordParts[2] = false;
}
} // end of the for loop

This is the code that executes if the parent element isn't <word>.  It will reset the booleans, so that we're ready to start over when we do find the elements with a <word> parent.
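
If you want to play with the three-flags logic on its own, without xerces or an XML file, here is a rough standalone sketch of the same loop.  It is not the code from main.cpp.  FakeNode and collectWords are names I invented, the node list stands in for what the node iterator would hand us, and std::atoi stands in for XMLString::parseInt.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for one element visited by the node iterator (hypothetical type).
struct FakeNode {
    std::string parentName;  // name of the parent element
    std::string name;        // name of this element
    std::string text;        // text inside the element, if any
};

typedef std::map<std::string, std::pair<int, int> > WordMap;

// Walk the nodes in document order and collect every complete <word>.
WordMap collectWords(const std::vector<FakeNode>& nodes) {
    WordMap data;
    bool wordParts[3] = {false, false, false};
    std::string wordText;
    std::pair<int, int> wordTypeValue;

    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const FakeNode& node = nodes[i];
        if (node.parentName == "word") {
            if (node.name == "wordText") {
                wordParts[0] = true;
                wordText = node.text;
            } else if (node.name == "wordType") {
                wordParts[1] = true;
                wordTypeValue.first = std::atoi(node.text.c_str());
            } else if (node.name == "wordValue") {
                wordParts[2] = true;
                wordTypeValue.second = std::atoi(node.text.c_str());
            }
            // All three parts seen: store the word and start fresh.
            if (wordParts[0] && wordParts[1] && wordParts[2]) {
                data[wordText] = wordTypeValue;
                wordParts[0] = wordParts[1] = wordParts[2] = false;
            }
        } else {
            // Not in a word (this includes the <word> element itself).
            wordParts[0] = wordParts[1] = wordParts[2] = false;
        }
    }
    return data;
}
```

Notice that the <word> element itself has a different parent, so reaching the next <word> resets the flags.  That means a <word> that is missing one of its three parts is silently dropped rather than mixed in with the next one.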


cout << endl << "STL map contents:" << endl << endl;

for ( map<string, pair<int,int> >::const_iterator iter = myData.begin();
        iter != myData.end(); ++iter ) {
    cout << "Word: " << iter->first << ", ";
    cout << "Type: " << iter->second.first << ", ";
    cout << "Value: " << iter->second.second << "." << endl;
    
}
cout << endl << "There are " << myData.size();
cout << " words loaded." << endl << endl;

This last bit of code just iterates through the map and outputs everything to the console, to show that the process worked.  If you don't understand this code, read up on the STL map and STL pair.
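
One detail that can be surprising the first time: a map hands its entries back sorted by key, not in the order they appeared in the XML.  Here is a small sketch that builds the same output lines as strings so that's easy to see.  describeWords and WordMap are made-up names for this example, not part of main.cpp.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::pair<int, int> > WordMap;

// Build one output line per entry, in the order the map iterates (sorted by key).
std::vector<std::string> describeWords(const WordMap& data) {
    std::vector<std::string> lines;
    for (WordMap::const_iterator iter = data.begin(); iter != data.end(); ++iter) {
        std::ostringstream line;
        line << "Word: " << iter->first << ", "
             << "Type: " << iter->second.first << ", "
             << "Value: " << iter->second.second << ".";
        lines.push_back(line.str());
    }
    return lines;
}
```

If you insert "pear" and then "apple", the first line that comes back is the one for "apple".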

Not so bad after all!!!  Please leave me a comment if you can't get it to work, or have trouble understanding any part of the code.