captain holly java blog

data security through small cell suppression

December 22, 2009

It seems like the worlds of statistics and Java don’t talk to one another enough.

Small cell suppression is a statistical term for preventing users from inferring private information from public sets of data. For example, consider a survey on athletes with staph infections that is queryable by age, county, sport and race. If there were a statistically small number of Hispanic wrestlers in Otter Tail County, you could probably guess who had a staph infection. So, if a population is identified as statistically vulnerable to this kind of inference, the data for it is suppressed.
The Washington State Dept. of Health page has a pretty good explanation:

Why are small numbers a concern in public health assessment?

Public health policy decisions are fuelled by information. Often, this information is in the form of statistical data. Questions concerning health outcomes and related health behaviors and environmental factors often are studied within small subgroups of a population. Continuing improvements in the performance and availability of computing resources, including geographic information systems, and the need to better understand the relationships between environment, behavior, and consequent health effects have led to increased demand for data on small populations. These demands are often at odds with the need to preserve privacy and data confidentiality. Small numbers also raise statistical issues concerning the accuracy, and thus usefulness, of the data.

In general, problems with confidentiality arise when there are small denominators (population size represented in a specific cell in a table); and, problems with data reliability arise when there are small numerators (cases in a specific cell in a table).
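
The basic idea can be sketched in a few lines of Java. This is only a minimal illustration of primary suppression, not what any particular stats package does: the class name, the cell labels and the threshold of 5 are all assumptions made up for the example, and real disclosure rules are set by policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Primary suppression sketch: hide any cell whose count is below a threshold. */
class SmallCellSuppression {

    /**
     * Returns the same cells in the same order, with counts below minCellSize
     * replaced by Optional.empty(). The threshold is an assumption; pick it
     * according to the disclosure policy that applies to your data.
     */
    static Map<String, Optional<Integer>> suppress(Map<String, Integer> counts, int minCellSize) {
        Map<String, Optional<Integer>> safe = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> cell : counts.entrySet()) {
            int n = cell.getValue();
            Optional<Integer> published = n < minCellSize ? Optional.empty() : Optional.of(n);
            safe.put(cell.getKey(), published);
        }
        return safe;
    }

    public static void main(String[] args) {
        // Hypothetical query result: staph cases by sport in one county.
        Map<String, Integer> cases = new LinkedHashMap<>();
        cases.put("wrestling", 3);
        cases.put("football", 41);

        for (Map.Entry<String, Optional<Integer>> cell : suppress(cases, 5).entrySet()) {
            Optional<Integer> n = cell.getValue();
            System.out.println(cell.getKey() + ": " + (n.isPresent() ? n.get().toString() : "suppressed"));
        }
        // prints "wrestling: suppressed" then "football: 41"
    }
}
```

Note that this is not enough on its own if row or column totals are also published, because a single suppressed cell can be recovered by subtraction. Handling that (secondary suppression) is the hard part.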

Definitions
The broader term for these controls is “Statistical Disclosure Control”. The challenge is to find the optimal level, since too little control leaks private data and too much control makes published survey data useless.
“Imputation” is the practice of substituting values for missing data items. If we are leaving out data to protect confidentiality, then substitute data must be imputed so as to not skew the overall results.
“Inference”: The practice of finding secret data in published survey results. By measuring inference, we can find out if disclosure control is an issue.
Spearman’s Rank Correlation: a statistical tool for inference. It can find out how closely two variables are tied. This web page will perform this correlation for you (if you are ready to hand type your data into a web form).
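
If you would rather not hand type data into a web form, Spearman's rank correlation is short enough to sketch in Java. The approach below is the standard one (rank both variables, give tied values the average of their ranks, then take the Pearson correlation of the ranks); the class and method names are my own and nothing here comes from an existing library.

```java
import java.util.Arrays;

/** Spearman's rank correlation: Pearson correlation computed on ranks. */
class Spearman {

    /** Ranks from 1..n; tied values share the average of the ranks they span. */
    static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort the indices by the value they point at.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && values[order[end + 1]] == values[order[start]]) {
                end++;
            }
            double averageRank = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = averageRank;
            }
            start = end + 1;
        }
        return ranks;
    }

    /** Plain Pearson correlation of two equal-length arrays. */
    static double pearson(double[] x, double[] y) {
        double meanX = Arrays.stream(x).average().orElse(Double.NaN);
        double meanY = Arrays.stream(y).average().orElse(Double.NaN);
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Result is between -1 and 1; values near either end mean a strong monotonic tie. */
    static double rho(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length, at least 2 values each");
        }
        return pearson(ranks(x), ranks(y));
    }

    public static void main(String[] args) {
        // Made-up data: y rises whenever x rises, so the ranks agree perfectly.
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {2, 4, 6, 8, 10};
        System.out.println(rho(x, y)); // prints 1.0
    }
}
```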

I could only find one tool related to this in the Java world. I’m surprised it isn’t more of a booming field, since it touches on survey data, health and financial data, and security and privacy. Is that too small a niche? I doubt it. Inattention to the dangers of leaking information in this way could potentially cause a lot of harm and cost a lot of money.

The stats package SAS has small cell suppression features. This document (Word Doc) discusses how to deal with the holes in the data that result from suppression.

So, how do I get this feature into my Java app?
R = the open source statistical package
CRAN = a repository of packages for use with the R language
sdcTable = an R package for statistical disclosure control of tabular data
lpSolve = an R package that sdcTable depends on
rJava = an R package that allows R to create Java objects and, through the JRI package that is now part of rJava, allows Java to run R in a single thread and make calls to it.
JGR = a Java GUI for R that makes use of rJava. The R binaries must be installed, and the JGR jar then allows Java to call them. The source of JGR has good, production quality examples of how to call R from Java.
Using all that, one should be able to create an ad-hoc query front end for survey data, run submitted queries through small cell suppression rules in R, and return safe data.
There, I solved your small cell suppression problems. I’ll leave the details to the reader. What could be easier than integrating a stack of open source C and Java projects into your web app? Or, rather, tune in for part II: implementing this stack O’ fun.


Posted in Uncategorized

IntelliJ Idea: Notes on switching

December 19, 2009

I recently switched over from working primarily with Eclipse/MyEclipse. These are some large and small obstacles I ran into and how to overcome them.

  1. I want to ignore persistence framework errors. Go to Project Structure –> JPA facet and delete the Data Sources Mappings (but not the JPA Configuration Descriptor!).
  2. Web application doesn’t reflect changes to html, xhtml, jsp, etc. Go to Project Structure –> Java EE build settings. Make sure the Exploded Directory setting is the same directory the server is using (i.e. where your project lives on disk). Also make sure the project compile output path is where your project lives and not some crazy IntelliJ-invented directory.
  3. I want the editor to be linked with the project tree, like in Eclipse. This is Autoscroll from Source, a button in the top row of the project pane.
  4. I used Ctrl-Shift-R (for resource) all the time in Eclipse. In IntelliJ IDEA, the same function is Ctrl-Shift-N (for name).
  5. Auto complete does not work! In my case, this was due to the La Clojure plugin (0.2.172). When I disabled this plugin and restarted, autocomplete (and several other features) came back. A web search on this turned up nothing. Maybe now it will.
  6. How to integrate CVS
    • If CVS is not connected, go to Version Control –> CVS –> Configure CVS Roots –> Test Connection. This appeared to reset the connection for me.
    • To set up a CVS repo: Version Control –> CVS –> Configure CVS Roots –> click the “plus” button to make a new root. Enter your CVS info.
    • To import an existing project into IntelliJ IDEA: File –> Open Project –> browse to find the .pom file.
  7. How to get vim keyboard mappings in IntelliJ. Go to Settings –> Plugins –> Available –> right click on IdeaVIM to install. The step I skipped screwed me up big time: you must copy the keymap file according to these directions.
  8. I hacked the authentication mechanism on an app so I wouldn’t have to log in every time during testing, and I was afraid I might accidentally commit it to CVS. So I had to ensure this file never got mixed in with the rest of our code. This is a CVS question rather than an IntelliJ IDEA question, but the answer is to create a new branch. Right-click on file –>CVS–>create branch (name it “DEAD_BRANCH” or something) and check the “Switch to this branch” box. The next time you go to commit that file or the directory it is in, that file will show up as [switched to tag DEAD_BRANCH] and if committed, will only be committed to that branch, so that your co-workers, when they update, will not get your screwed up file.
  9. Keystroke goodness. The following keystrokes are indispensable. For a complete keystroke chart, go to help –> keystroke reference
    • Move lines or blocks of code. This comes in handy on almost a daily basis and for some reason isn’t in the keystroke chart. Ctrl-Shift-Up moves a line or selected block up. Ctrl-Shift-Down moves it down. If it does not work, try hitting Escape.
    • IntelliJ has a history of clipboard (buffer) contents. To paste from it, use Ctrl-Shift-V.
    • Rename: Shift-F6
    • Generate Getters and Setters: Alt-Insert
    • Find usages: Alt-F7
    • Duplicate line or selection: Ctrl-D

Posted in Uncategorized

Alternate languages on the JVM?

October 7, 2009

I’m trying to summarize several discussions about alternate languages on the JVM that I absorbed at the No Fluff Just Stuff conference. Can I become a language evangelist based on a weekend at a conference? I suppose not, but there were a lot of compelling arguments for why we should be looking at some of these new functional languages on the JVM. It was put forward that most of the reasons we like Java have to do with the JVM and not with the Java language:

  1. Cross platform
  2. stability
  3. Performance
  4. security
  5. huge world of libraries

These will hold true with any language that compiles to the JVM.

Why are they even considering new languages? Multiple reasons bubbled up from the conference as a whole.

Extensibility.
One example discussed was Hadoop, an open source framework, inspired by Google’s MapReduce papers, that handles huge amounts of data in a distributed way. Its developers evidently found some of the core Java classes insufficient for their needs. If you look at the docs for org.apache.hadoop.io.Text, it says, “It provides methods to serialize, deserialize, and compare texts at byte level…. In addition, it provides methods for string traversal without converting the byte array to a string.” Does this point to an extensibility problem in Java? If not, why couldn’t they reuse any code from String? Someone at the conference asked: why can’t I make Object define toXmlString() so that every one of my classes that descends from Object automatically has a toXmlString()? This is extensibility, and Java doesn’t do it as completely as some other languages might.

A language shouldn’t limit what you can do. Certain language constructs not available in Java (closures, more flexible switch statements, folding) enable developers to be far more efficient.

OO might be failing us. We try to think of objects as changing in place. Rich Hickey, the creator of Clojure, rejects this: “The future is a function of the past, it doesn’t change it.” The alternative is to stop thinking of data as persisting and changing over time and instead recognize that a thing is immutable, and that when it changes it becomes a different immutable thing. Like a date, or an account balance. The state of an account at a point in time is immutable. Adding money to it does not change it; it creates a new state. This 55 minute video of Rich Hickey explaining some of these ideas was recommended at the conference and is amazing. As he explains, all of our concurrency problems come from the notion of objects changing in place.
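
You don't need a new language to try the account example out. Here is a small sketch in plain Java, with a class name I made up for illustration: the balance is final, and a deposit hands back a new state instead of changing the old one.

```java
import java.math.BigDecimal;

/** An account balance as a value: a deposit produces a new state and leaves the old one alone. */
final class AccountState {
    private final BigDecimal balance;

    AccountState(BigDecimal balance) {
        this.balance = balance;
    }

    BigDecimal balance() {
        return balance;
    }

    /** Returns the next state; this state is never modified. */
    AccountState deposit(BigDecimal amount) {
        return new AccountState(balance.add(amount));
    }

    public static void main(String[] args) {
        AccountState monday = new AccountState(new BigDecimal("100.00"));
        AccountState tuesday = monday.deposit(new BigDecimal("25.50"));

        System.out.println(monday.balance());  // prints 100.00, Monday's state is untouched
        System.out.println(tuesday.balance()); // prints 125.50
    }
}
```

Because nothing can change an AccountState after it is built, any number of threads can read one without locking, which is the concurrency argument in miniature.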


Posted in Uncategorized

No Fluff Just Stuff – twin cities

October 5, 2009

I learned a bunch of neat stuff over the weekend at NFJS. It was a wonderful combination of filling in the gaps for tools I use all the time and trying to show us what is coming in the future. The future, everyone agreed, was in alternate, functional languages on the JVM. I’ll talk about why in a separate post. The non-tech talks were all about agile development. At the end my brain was all stretched out and floppy. Today I want to go in a million directions at once.


Posted in Uncategorized

struts form boolean checkbox

September 22, 2009

We all understand that when a checkbox is not checked on a form, it is not present in the request object. This is the basis for many headaches in web application programming, especially when using multiple form pages. When using multiple form pages, as in a wizard, the struts way around is to have a reset() method that contains some logic for setting the value to false if it doesn’t exist in the request. Again, this applies to situations with a session scoped form.

The documentation for the html:checkbox tag says:

WARNING: In order to correctly recognize unchecked checkboxes, the ActionForm bean associated with this form must include a statement setting the corresponding boolean property to false in the reset() method.
In practice, the only properties that need to be reset are those which represent checkboxes on a session-scoped form. Otherwise, properties can be given initial values where the field is declared.

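// Struts calls this before it populates the form from each request
@Override
public void reset(ActionMapping mapping, HttpServletRequest request) {
    this.citizen = false;
}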
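
To see why the reset matters on a session-scoped form, here is a framework-free sketch. It is not Struts code: the class, the populate() method and the parameter map are stand-ins I invented to imitate a form bean that survives across requests.

```java
import java.util.HashMap;
import java.util.Map;

/** Framework-free stand-in for a session-scoped form bean (not the Struts API). */
class ApplicantFormSketch {
    private boolean citizen;

    boolean isCitizen() {
        return citizen;
    }

    /** Assume the box is unchecked until the request says otherwise. */
    void reset() {
        citizen = false;
    }

    /** Copies submitted parameters in; an unchecked checkbox sends no parameter at all. */
    void populate(Map<String, String> requestParams) {
        if (requestParams.containsKey("citizen")) {
            citizen = Boolean.parseBoolean(requestParams.get("citizen"));
        }
    }

    public static void main(String[] args) {
        ApplicantFormSketch form = new ApplicantFormSketch(); // kept in the session

        Map<String, String> boxChecked = new HashMap<>();
        boxChecked.put("citizen", "true");
        form.reset();
        form.populate(boxChecked);
        System.out.println(form.isCitizen()); // prints true

        // The user comes back to the page and unchecks the box: no parameter is sent.
        Map<String, String> boxUnchecked = new HashMap<>();
        form.populate(boxUnchecked);          // no reset first
        System.out.println(form.isCitizen()); // prints true, the stale value

        form.reset();
        form.populate(boxUnchecked);
        System.out.println(form.isCitizen()); // prints false
    }
}
```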

There are several confusing posts out there in forums about how to populate checkboxes when viewing forms with existing data. One says to have a hidden form field with the same name as the checkbox. Another has us jumping out of struts and using regular JSP tags with logic. Both of these are unnecessary and have potentially bad repercussions later.
The real solution is to use an html:checkbox with a name equal to that of a bean and a property equal to the name of the boolean variable in that bean that the checkbox captures. The following will check or uncheck the checkbox depending on the value of “citizen” in the applicantBean:

<html:checkbox name="applicantBean" property="citizen" value="true"/>

For this to work, your code must create an empty applicant bean before loading the blank form, or struts will whine that there is no such thing as “applicantBean” in any scope.


Posted in Uncategorized

how to use Apache Bench (ab) to test a page that requires login

September 10, 2009

ab is a tight and effective tool for load testing web applications. It comes with every install of Apache httpd.
If a page is behind a login screen, you can use the -p flag to name a file that contains the POST variables for login and password, and the -T flag to set the content type:


C:\Apache2.2\bin>ab -p C:\posts\post.txt -T application/x-www-form-urlencoded -n 1000 -c 22 http://myServer:8008/myapplication/CentralCashier/userLogin.do
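
The file given to -p is sent as the request body, so with the content type above it needs to hold the form fields url-encoded on one line. A quick way to build that line is sketched below. The field names "username" and "password" and the values are placeholders; use whatever your login form actually posts. URLEncoder.encode(String, Charset) needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds an application/x-www-form-urlencoded body such as the one ab reads with -p. */
class PostBodyBuilder {

    static String formEncode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names and values; check your own login form.
        Map<String, String> login = new LinkedHashMap<>();
        login.put("username", "tim");
        login.put("password", "p@ss word&1");

        System.out.println(formEncode(login));
        // prints username=tim&password=p%40ss+word%261
    }
}
```

Save the output as post.txt. It is probably safest to leave off a trailing newline, since the file contents go out as the body as-is.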

If a page is only accessible by a logged in user, not directly accessible from the login page, then you can use the -C flag to define a cookie. You have to get the value of the session identifier cookie from a valid session. Use a proxy like Webscarab or Paros to capture a request and copy the JSESSIONID=xxxxx from the request and use it with ab:


C:\Apache2.2\bin>ab -C JSESSIONID=36D5AE14223E1D4ED0B2BBC5C7F411EA -n 1000 -c 22 http://myServer:8008/myapplication/CentralCashier/userSearch.do?method=search

Alternatively, you can just turn off the authentication filter for the purposes of your test.


Posted in tomcat

Evaluating WebScarab

July 29, 2009

I was asked to do a security assessment on a co-worker’s ColdFusion application. Every page is protected by a NOT findnocase(cgi.http_host,cgi.http_referer) check to ensure the request came from the same domain. On the face of it, this is a good way to prevent forced browsing and most URL injection attacks, because if you mess with the URL in the browser, this check knows it and stops all the shenanigans.
This is where a proxy comes in. I’ve worked a bunch with Paros and some with Burp, but my employer does not allow me to download these without some extra paperwork. Webscarab, for some reason, is allowed. Webscarab is written entirely in Java, has a zippy UI and has widening adoption.

Webscarab allowed me to do forced browsing on the application and learn that the application relied solely on that domain check to make sure the user was authenticated (That is, they could only get to the site through the login form). Webscarab also allowed me to find many XSS bugs.
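
The weakness is easy to see if you write the check out. Below is my own Java rendering of that ColdFusion test, with made-up host names; it is a sketch of the logic, not the application's code. Both Host and Referer are supplied by the client, so anything that can edit request headers (which is exactly what a proxy does) can make the check pass.

```java
import java.util.Locale;

/** Java rendering of the ColdFusion test: does the Referer contain the Host, ignoring case? */
class RefererCheckSketch {

    static boolean passesDomainCheck(String hostHeader, String refererHeader) {
        if (hostHeader == null || refererHeader == null) {
            return false; // e.g. a URL typed straight into the address bar sends no Referer
        }
        return refererHeader.toLowerCase(Locale.ROOT)
                .contains(hostHeader.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String host = "intranet.example.org"; // hypothetical host name

        // Forced browsing from the address bar: no Referer, so the check blocks it.
        System.out.println(passesDomainCheck(host, null)); // prints false

        // Same request with a Referer header added in a proxy: indistinguishable from a real click.
        System.out.println(passesDomainCheck(host, "http://intranet.example.org/login.cfm")); // prints true

        // A substring match is satisfied even by a Referer on another site.
        System.out.println(passesDomainCheck(host, "http://elsewhere.example/?x=intranet.example.org")); // prints true
    }
}
```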

Webscarab is infinitely scriptable (with beanshell).

Webscarab has a tool that evaluates session identifiers for their strength. I would guess that most web frameworks these days have very strong session identifiers. In fact, I challenge anyone to find an example of a weak session identifier on any web app that shouldn’t be replaced anyway for one hundred other reasons.

Startup Options
Webscarab starts in Lite mode, which is just the web proxy, by default. To get the full meal, you have to start with java -DWebscarab.lite=false -jar webscarab.jar
Default memory is 64MB and this can get used up quickly. Online examples show webscarab having ~510 MB available. This is achieved by adding -Xms32m -Xmx510m to the java startup args. Just like with some other java desktop apps (Like IntelliJ Idea) you can click on the Green|Yellow|Red bar along the bottom of the window to force garbage collection and free up some memory.

Things That Could Be Improved:

  1. Inconsistency: Some features are available through a right click, some through a double click, some from a menu item and others from buttons or tabs somewhere on the screen. Some fields look editable but aren’t. Some are editable on one click, others on two. Some edit fields select the whole field when clicked, but typing appends to the end of the existing entry.
  2. Other screens have a delete button. Not the Proxy Listener tab. To delete a listener you must stop it. If a listener fails to start, it cannot be stopped and so cannot be deleted. I have to stop any other service using the same port as my listener, THEN start my listener, and THEN stop my listener to delete it.
  3. The interface for getting rid of conversations is difficult to use. Webscarab can fill up pretty fast with banal conversations and the only easy way to get rid of them all is a restart. There is a Tools –> remove conversations menu item, but no regex that I enter seems to get rid of conversations.
  4. There should be some way to construct the proxy filters based on existing requests. By this I mean when a request is trapped that you never want to see again, you can flag it in some way to add it to the ignore list.
  5. Judging from several posts to the mailing list, Webscarab only works with Sun’s brand of Java.

To address user experience as well as other issues, Webscarab is undergoing a total rewrite. This is currently known as Webscarab NG. They will be using the Spring Rich Client Platform. The new product also has database integration. This is a work in progress and needs lots of testing. So, if you are looking for an open source project to help, this would be an excellent choice. According to the email list, the Webscarab NG project leader has been directing his work at the OWASP Proxy lately. Even though Webscarab NG is in development, development also continues on the current Webscarab.


DRPL: definitely broken

July 28, 2009

The Denied and Restricted Parties List (DRPL) is kind of a No-Fly list for export restrictions and since Sun has some encryption related technology, it is a national security concern that someone might take the SCJP exam.

After initially being informed that my request to take the exam was denied, today I got an email from Sun saying that I’m not, after all, someone who might do bad stuff.
Here is some background email that was attached to my email:

The following individual, as a result of screening, has been identified as being as a potentially non-compliant export customer:

Search Key: US2121923
First and Last Name: Tim McGuire

City: St. Paul

Country: US

Result of DRPL check: Detected

Date and time of denial: Mon Jul 27 10:10:15 MDT 2009

Course Order Numbers: No Numbers Generated.

Reason for denial: Not Available from service.

From here and here I see that a bunch of people have been inconvenienced and aggravated to varying degrees because some mouth breather hasn’t figured out that no-fly lists are a fake idea. Imagine if Macy’s did a surprise sniff test on every 100th customer in their underwear department? This is just like that.


Posted in Uncategorized

Slowloris vs tomcat

June 19, 2009

RSnake has been thinking about a denial of service attack against web servers that involves sending partial HTTP requests to use up the number of allowed clients. Sending a carefully crafted partial request causes the server to wait A LONG TIME for the rest of it, tying up its resources so that it becomes temporarily unavailable to other visitors. Apache HTTPD is mentioned as a server that is vulnerable. IIS is mentioned as one that is not. RSnake, being a realist and not an anti-Microsoft evangelist, often says things that make the open source advocates uncomfortable. (“PHP is the bane of my existence” and “Whenever I assess a dot net application I know right off the bat that I’m going to find half the number of vulnerabilities”).

A few notes about Slowloris: It can’t effectively DoS a box from Windows because it works by creating hundreds of sockets and Windows only allows a max of 130. It doesn’t crash anything, so it is a gentle tool (haha). It just happens to make web applications unavailable for as long as the attacker wishes. It does, by the way, send out hundreds of packets, so it is detectable by the administrator.

To use Slowloris, first establish a timeout for the web server you are attacking:
./slowloris.pl -dns localhost -port 8080 -test

This should return some numbers to use for a timeout.

They don’t mention tomcat, so I spent most of the afternoon setting up a machine to see if this tool can DOS tomcat.

drum roll please….

[Screenshot: slowloris test]

From the test output it seems clear that we’ll be using a 5 second timeout for TCP and a 30000 millisecond timeout for HTTP.
then,
./slowloris.pl -dns localhost -port 8080 -timeout 30000 -num 500 -tcpto 5

the above opens 500 sockets and uses a tcp timeout of 5 seconds and looks like this:

[Screenshot: slowloris execution]

Now, try and connect to the benighted tomcat server.
Hmmm. Works fine. What gives? I suspect that at this number of connections (500), I am still able to get a connection. The first visit takes a really long time, but once I get through, I can use the site normally. This matches the statement in the documentation that “    “. If I raise the number of connections, it still takes a very long time to load the first page, but thereafter it is just as easy to access the application.

When I run slowloris on the same server, however, tomcat is completely DoS-ed. I’m impressed with the absolute unavailability of tomcat in relation to the low level of traffic that slowloris generates.

Ooooh. I thought I was supposed to convert 30 seconds into milliseconds. Wrong! The timeout is in seconds, and 30,000 seconds is clearly too high. When I set it down to 30, slowloris CRUSHED tomcat, remotely or locally. As you can see below, setting the timeout correctly allowed many more packets to be sent.

[Screenshot: slowloris success]


Posted in security, tomcat

How do you like jadclipse?

June 5, 2009

I installed jadclipse to see if it was a way to view source of all the libraries I use but don’t have the source code for.

Installation is simple, with the small extra step of downloading jad and telling eclipse where to find the jad.exe. (Window –> Preferences –> Java –> Decompilers –> Jad)

All the jad command line options are represented in the jad dialog.

Inner classes and static blocks were decompiled just fine. I did notice that jad left out any braces in if-then blocks that weren’t needed. Nitpicking a bit here, but I can’t read code without braces. The braces were put in when I checked the “show redundant braces” option, but not until I restarted Eclipse.

Finally, some classes seem to have a stuck setting causing them to open in class viewer. For instance, Struts Action gets decompiled just fine, but Struts DispatchAction just shows itself in that awful class viewer. Restarting Eclipse made this problem go away too.

Big thumbs up for this very useful eclipse plugin.

[Image: thumbs up]


Posted in Uncategorized