24
Mar
19

Roll your own end-to-end solution for hosting your next BIG Project

Recently, a great friend – that happens to also be a former ex-business partner – contacted me to discuss how we could apply machine learning to reduce the costs associated with an activity related to the financial department of every company here in Brazil – and worldwide I guess. To be more precise, he wished to understand if it was possible to use any classification algorithm to automatize part of this process.

After a few days performing a preliminary investigation we found the problem was worth it and we decided to move forward with a PoC. Even though he already had a few servers for hosting our PoC, I decided that I wouldn’t risk interfering with his production environment and that instead I would setup a minimal server under any cheap hosting solution. Ideally I wished I had the same infrastructure I am currently working with: a Kubernetes cluster running on top of Rancher Server but this would require at the bare minimum one server with more than 2GB of memory (with overlapping planes, that I personally don’t recommend). Since such a huge environment was out of consideration for a PoC I opted to stick with a machine running Rancher OS, this way I could still have at least a minimal Linux that could be able of running Docker containers. But still one thing surprised me: Rancher OS doesn’t run on machines with less than 2GB of memory, my first attempt was to run it on a machine with only 1GB and it got stuck on an endless loop boot. But it wasn’t a huge problem since it only increased my monthly fee from U$5 to U$10 and I got twice the memory and more storage.

Infrastructure

After setting up the Rancher OS instance I started to setup the stack for hosting the code and also running the solution after the first week of coding.

Git Hosting

I could have chosen Bitbucket for hosting my code but I thought it would be better to run something on my own premises, Gitlab could have been an option but its HUGE memory footprint is a no-go. Then I found a thread on reddit mentioning gitea as an interesting alternative to Gitlab. I decided to give it a try and it was a huge surprise! It has an impressive low memory footprint: 45Mb when idle and occasionally it spikes up to 60Mb and goes down again, not to mention that it had everything I needed.

Reverse Proxy

Then it was time to setup the reverse proxy that would take care of routing every HTTP(S) request to one of my services – No kubernetes, no ingress. Remember? – be it part of the dev infrastructure, be it part of the solution itself. Nginx to the rescue! The most simple setup I could think of was to have a directory for hosting a shell script that was responsible for running the Nginx container and doing a bind mount for the default.conf file (from the host pwd/default.conf to the container /etc/nginx/conf.d/default.conf), the file simply had a bunch of server sections with server_names and proxy_pass directives.

By that time, this was how the server looked like:

I was already using DuckDNS to avoid having to memorize the server ip address any time I had to SSH to it. Then I realized I needed HTTPS and therefore host names, for a PoC in such initial stages, buying a domain would be an overkill, so, DuckDNS again, this time for the actual solution.

Free HTTPS certificates

If spending with the domain wasn’t being considered by that time, with the HTTPS certificate was also a no-go. But with the advent of Let’s Encrypt we can now have SSL on our solutions without spending a penny. And requesting the certificate is even easier if you have the possibility of running CertBot’s public available docker container. As I wasn’t willing to investigate how to use certbot nginx plugin, I opted to run its simplest mechanism of issuing a certificate: the one you provide it a public available directory on your HTTP server in which it writes the content it receives from CertBot service during the certificate granting process.

Lightweight CI/CD

After struggling to run git on RancherOS (I even tried to run git as a container but I had so much file permissions issues that I gave up soon on this approact – repositories were cloned as root) I thought it’d be a good reason to anticipate the deployment of a CI/CD solution. I ended up ditching Jenkins for the same reason I had already dropped Gitlab: impressive memory footprint. After some research I found Buildbot: a python based minimalist CI/CD solution. Apart from the stock Buildbot Master Container, I’ve created a few worker containers: one based on stretch with support for PyEnv (in order to have a good support for scikit), a similar one based on Alpine, one Alpine based with Node (for building the UI) and a few other base containers (check them out at my Dockerhub account).

Databases

Finally, I had to deploy MySQL and PostgreSQL for both the dev stack and also for my own solution that was being developed. PostgreSQL was deployed as is but for MySQL I opted to slim its memory usage a little bit by following this post.

Wrap up

The project I am working on is based on Python and uses scikit-learn, Flask and SQLAlchemy with Alembic for the backend (running on Waitress) and Angular for the frontend.

The following picture provides an overview of the current containers and components running on my U$10 server:

The idea of this post was to give the overview of a recipe on how to build a cheap but comprehensive solution for hosting you next BIG idea. In the next post I’ll try to drill down on the details of setting up each of the components that were used. Feel free to comment if you have any questions.

 

 

Advertisements
01
Jun
15

Beginning on Node.js

Recently I started learning Node.JS as a result of a few projects developed on a startup I recently joined. We are using Parse.com on some of our projects and it happens to be based on Node.JS. Also, while learning Backbone.JS and searching for a solution to unit test our Backbone.JS code I stumbled upon Sails.JS. Sails.JS drew my attention immediately, its productivity was stunning. Suddenly, Javascript, something that only recently I started to give some credit (due to Backbone.JS and some other frameworks but ALL focused on the UI), was looking as a promising solution for a great niche of application.

While studying the Node.JS platform I soon realized that some of the similarities between what can be seen on the Java <-> Android is true for Javascript <-> Node.JS. Node.JS happens to run Javascript code but it isn’t simply Javascript. The same is true for Android as we can’t say that Android isn’t simply Java. Java presumes a lot of things that isn’t possible on Android, eg.: ClassLoading on Android doesn’t work as on Java, it seems that you are limitted to a DexClassLoader (and as we are on the Dex subject it is THE sign that Android isn’t really Java otherwise it would simply run Java bytecode).

But what about Node.JS (I’ll refer to it simply as Node from now on)? Although Node runs Javascript it has some unique things that form the Node platform, one of them is the process global object. One of its most known functions is the process.nextTick(callback) that allows code to schedule callbacks to be immediately executed on the next run of the event loop (not exactly the next run but it is guaranteed to run before any I/O event). Another important characteristic of code running inside node – but this one was drawn ipsis litteris from javascript – is the run-to-completion. This characteristic allows javascript code to place event listeners on an EventEmitter without risking the event firing before the listeners are set since the I/O code would only be able to run on the next execution of the event loop. I’ve talked a lot about the EventQueue but I’ve not mentioned that Node runs on a single thread and this alleviates the issues related to thread contention but we STILL have race conditions since we have code running concurrently (between callbacks), eg.: if you have a code that checks whether a file exists (async code since it is I/O based) and downloads that file if it does not exist you may risk having that file overwritten since the callbacks that checks for the file must be run on after the other and then the two downloads would proceed as the file did not exist. So how to scale Node applications? Node provides a core module named cluster that allows for a model similar to Apache MPM Prefork.

Enough of Node for today… I’ll try to come back soon and talk about NPM (another node cornerstone) and Sails.JS since Sails.JS is a REALLY interesting framework that draws from a lot of best practices and concepts found on lots of other frameworks (and still produces really clean and organized code).

 

20
Aug
12

Designing an ASTERIX Protocol Parser

In the next weeks I’ll try to describe the steps taken in order to decompose ASTERIX, Eurocontrol variable length binary protocol for Radar message Exchange. According to Eurocontrol, ASTERIX Stands for: All Purpose STructured Eurocontrol SuRveillance Information EXchange. Try not to confuse it with Asterisk (the opensource telephony project).

One of the major challenges was to achieve a great reuse of the domain classes designed to mirror the protocol fields since ASTERIX employs a design where some packet types are simply extensions of previous released versions, therefore sharing a huge amount of fields. One example is the Cat034 and Cat002 packets. If you are unfamiliar with ASTERIX, Eurocontrol site has a great deal of documentation on ASTERIX format.

22
May
11

JSF Datatable with database sorting, filtering and pagination

When database grows (or may grow) significantly it is necessary to consider database pagination for loading data since the Application Server memory isn’t sized for handling such huge amount of data.

Luckily rich:extendedDataTable provides an easy way of delegating the sorting, filtering and also pagination to the underlying data store. I sincerely remember that the I saw the base for this idea somewhere in Seam Framework forum (if anyone knows something similar please let me know) but I couldn’t find it for referencing, anyways it is now extended and more flexible. Another foundation of this technique is the org.jboss.seam.framework.Query since it provides means of expressing filtering and sorting as a huge string even with JSF variables (eg.: #{something.otherthing}).

Lets start with the EntityExtendedTableDataModel that is responsible for passing filtering and sorting data from view layer to data layer:

<pre>/**
EntityExtendedTableDataModel - Database pagination, sorting and filtering for richfaces datatables
Copyright (C) 2011 Rafael Ribeiro

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
*/</pre>
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.faces.component.html.HtmlInputText;
import javax.faces.context.FacesContext;
import javax.faces.convert.Converter;
import javax.faces.convert.ConverterException;

import org.apache.commons.lang.StringUtils;
import org.jboss.seam.core.Expressions;
import org.jboss.seam.core.Expressions.ValueExpression;
import org.jboss.seam.framework.Query;
import org.jboss.seam.ui.AbstractEntityLoader;
import org.richfaces.model.DataProvider;
import org.richfaces.model.ExtendedFilterField;
import org.richfaces.model.ExtendedTableDataModel;
import org.richfaces.model.FilterField;
import org.richfaces.model.Modifiable;
import org.richfaces.model.Ordering;
import org.richfaces.model.SortField2;

public class EntityExtendedTableDataModel extends ExtendedTableDataModel implements Modifiable {
EntityDataProvider entityDataProvider;
public EntityExtendedTableDataModel(Query query) {
super(new EntityDataProvider(query));
entityDataProvider = (EntityDataProvider) getDataProvider();
}

public void modify(List<FilterField> filterFields,
List<SortField2> sortFields) {
performSort(sortFields);
performFilter(filterFields);
}

private void performSort(List<SortField2> sortFields) {
StringBuilder order = new StringBuilder();
Pattern p = Pattern.compile("\\#\\{(.+)\\}");
for (SortField2 s: sortFields) {
if (Ordering.UNSORTED.equals(s.getOrdering()) == false) {
String expr = s.getExpression().getExpressionString();
Matcher m = p.matcher(expr);
if (m.matches()) { //remove the #{} otherwise richfaces wont trigger the sort event
order.append(m.group(1));
}
else {
order.append(expr);
}
order.append(" ");
order.append(Ordering.ASCENDING.equals(s.getOrdering()) ? "ASC" : "DESC");
order.append(", ");
}
}
if (order.length() > 0)
entityDataProvider.getQuery().setOrder(order.delete(order.length()-2, order.length()).toString());
}
//allows us to specify filter as o.name == #{exampleEntity.name}
private void performFilter(List<FilterField> filterFields) {
Expressions expressions = new org.jboss.seam.core.Expressions();
Pattern p = Pattern.compile(".*(\\#\\{.+\\}).*");
List<String> restrictions = new ArrayList<String>();
for (FilterField f: filterFields) {
ExtendedFilterField e = (ExtendedFilterField) f;
if (StringUtils.isEmpty(e.getFilterValue()))
continue;
StringBuilder filter = new StringBuilder();
String expr = e.getExpression().getExpressionString();
Matcher m = p.matcher(expr);
if (!m.matches())
continue;
ValueExpression ve = expressions.createValueExpression(m.group(1));
FacesContext ctx = FacesContext.getCurrentInstance();
Converter c = ctx.getApplication().createConverter(ve.getType());
if (c == null)
ve.setValue(e.getFilterValue());
else {
try {
ve.setValue(c.getAsObject(ctx, new HtmlInputText(), e.getFilterValue()));
} catch (ConverterException ce) {
continue;
}
}
restrictions.add(expr);
}
entityDataProvider.getQuery().setRestrictionExpressionStrings(restrictions);
}

}
class EntityDataProvider implements DataProvider {

private Query query;
public EntityDataProvider(Query query) {
this.query = query;
}

public Object getItemByKey(Object key) {
return AbstractEntityLoader.instance().get(String.valueOf(key));
}

public List getItemsByRange(int firstRow, int endRow) {
query.setFirstResult(firstRow);
query.setMaxResults(endRow-firstRow);
return query.getResultList();
}

public Object getKey(Object item) {
return AbstractEntityLoader.instance().put(item);
}

public int getRowCount() {
return query.getResultCount().intValue();
}
protected Query getQuery() {
return query;
}
}

Now we have to replace Seam datamodels component with ours:

<pre>EntityExtendedTableDataModel - Database pagination, sorting and filtering for richfaces datatables
Copyright (C) 2011 Rafael Ribeiro

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
*/</pre>
import static org.jboss.seam.ScopeType.STATELESS;
import static org.jboss.seam.annotations.Install.FRAMEWORK;

import javax.faces.model.DataModel;

import org.jboss.seam.annotations.Install;
import org.jboss.seam.annotations.Name;
import org.jboss.seam.annotations.Scope;
import org.jboss.seam.annotations.intercept.BypassInterceptors;
import org.jboss.seam.faces.DataModels;
import org.jboss.seam.framework.EntityQuery;
import org.jboss.seam.framework.Query;

@Name("org.jboss.seam.faces.dataModels")
@Install(precedence=FRAMEWORK)
@Scope(STATELESS)
@BypassInterceptors
public class RichFacesModels extends DataModels
{

@Override
public DataModel getDataModel(Query query)
{
if (query instanceof EntityQuery)
{
return new EntityExtendedTableDataModel((EntityQuery) query);
}
else
{
return super.getDataModel(query);
}
}

}

Finally we specify in rich:column the filtering remembering that the base object name here must be in sync with the referenced query, example:


<rich:extendedDataTable id="listaPacientes" rows="10" value="#{queryAllEntities.dataModel}" var="obj">
<rich:column sortable="true" sortBy="#{o.name}" selfSorted="false"
filterBy="lower(o.name) like concat(lower(#{exampleEntity.name}),'%')" filterEvent="onkeyup"
label="Name">

The trick here is to specify selfSorted=”false” and adding the JSF #{} part for the sorting to work, this way o.name will be appended to Query sorting.

On the other hand, for filtering to work you only need to specify your query, note that what you specify between the #{} will hold the filter data handed to the Query.

Also note that you need to have a query and an entity as a component specified in components.xml, in this example, this could be the query and the example entity:


<component name="exampleEntity" class="br.com.rafaelri.MyEntity" scope="session" />
<framework:entity-query name="queryAllEntities" ejbql="select o from MyEntity o" />

This, combined with rich:dataScroller will result in sorting, pagination and filtering handled by the Database.

21
May
11

Enum backed h:selectOneRadio and h:selectOneMenu

Recently I needed to display a h:selectOneRadio and a h:selectOneMenu with values provided by a Java Enum. After some research on seamframework forum I saw one post that clarified a lot the way I should pursue, specially Pete’s comment. My greatest concern was the same of Pete: wiring up presentation and domain layer. But, after some analysis I found a simple solution that does not wire up domain and presentation layer and also respects the DRY principle. The solution involved crafting a class that would be instantiated as a Seam component on components.xml so it can be instantiated for each and every enum that we want to provide on view layer and also used EnumSet.allOf method so we could automagically iterate for each of the enum values.

Below, you can find the EnumList class:

/**
EnumList Seam Component - Converts Java Enums to List.
Copyright (C) 2011 Rafael Ribeiro

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
*/
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.lang.builder.ToStringStyle;

public class EnumList<T extends Enum<T>> {
private List<T> list;

public void setEnumClass(Class<T> c) {
list = new ArrayList<T>();
list.addAll(EnumSet.allOf(c));
}

public List<T> getList() {
return list;
}

public String toString() {
return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
.append("list", list).toString();

}
}

And the configure it on components.xml as follows:

<component name="myEnumComp" scope="application" auto-create="true">
<property name="enumClass">br.com.rafaelri.MyEnum</property>
</component>
<factory name="myEnum" value="#{myEnumComp.list}" scope="application" auto-create="true"/>

Finally you’ll refer to it on your xhtml as follows:

<h:selectOneMenu id="myEnumSelect" value="#{instanceHome.instance.myEnum}">
<s:selectItems var="enum" value="#{myEnum}" label="#{enum.description()}"/>
<s:convertEnum />
</h:selectOneMenu>


21
May
11

JPA 2 on Seam 2

As intelligently pointed by Thomas on CTP Java blog starting a Java Web application today from scratch presents you an interesting crossroads. crossroadsIf you stick with the proven Seam 2 option you end up also with its deficiencies not to mention some old versioned libraries. The other option is experimenting with Seam 3 but this on the other hand has few knowledge critical mass around. But with a few changes, Seam 2 can still provide value and at least Hibernate/JPA version can be improved bringing a bunch of bugfixes (eg.: proper schema generation that in Hibernate 3.3 was buggy – even though HHH1012 says it is fixed I never saw this on any 3.3 release) and new features (eg.: criteria api as also pointed by Thomas).

I’ve followed Thomas instructions but instead of coding the proxy method by method I suggest using Java Dynamic Proxies:


import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import javax.persistence.EntityManager;

import org.jboss.seam.persistence.PersistenceProvider;

public class Jpa2EntityManagerProxy implements InvocationHandler {

private EntityManager delegate;

public Jpa2EntityManagerProxy(EntityManager delegate) {
super();
this.delegate = delegate;
}

public Object invoke(Object proxy, Method method, Object[] args)
throws Throwable {
if (method.getName().equals("getDelegate")) {
return PersistenceProvider.instance().proxyDelegate(
delegate.getDelegate());
} else {
try {
return method.invoke(delegate, args);
} catch (InvocationTargetException e) {
throw e.getTargetException();
}
}
}

public static EntityManager newInstance(EntityManager entityManager) {
return (EntityManager) Proxy.newProxyInstance(
Jpa2EntityManagerProxy.class.getClassLoader(),
new Class[] { EntityManager.class },
new Jpa2EntityManagerProxy(entityManager));
}
}

Apart from this change I followed Thomas instructions as-is and it got JPA2 with Hibernate 3.5 working in my project.

04
Feb
11

Java IO fundamentals

I am usually questioned by friends regarding Java IO. Questions usually revolve between byte and char conversion and what determines the codepage used. My answer to them is always the same: if you are using Streams you are dealing with bytes and if you are dealing with Readers and Writers then you are dealing with Chars (usually you specified codepage to be used or it is using system default). I usually tell them to distrust any class named SomethingStream that have any method that operates over Strings (eg.: ServletOutputStream) cause they’ll usually do some implicit conversion (either using system codepage or something else out of your control).
I’ve prepared a diagram to clarify the main Java IO classes (and their main methods also):

Basic Java IO Classes

Basic Java IO Classes


On the bottom you’ll find Stream classes, they operate on bytes so codepage here is irrelevant. In the middle you’ll find conversion classes: those are the ones that allow you to forcefully define a codepage, they are the bridge classes that allow you to plug a Writer on an OutputStream or a Reader on an InputStream. You may use them to specify, for example, a codepage for writing on a TCP socket. And lastly, on top, you’ll find Character (and therefore String) oriented classes: Reader and Writer.
You may have noticed that I have included the Buffered version of the Stream classes. The Output one provides better performance under certain scenarios. The BufferedInputStream on the other hand allows you to mark and rewind its content, something usually useful for implementing protocol interpreters.




ClustrMaps

Blog Stats

  • 371,099 hits since aug'08
Advertisements

%d bloggers like this: