allow users to specify float types from OneHotEncoder and ContinuousEncoder #565

@tiemvanderdeure

Description

I am using a linear pipeline consisting of a OneHotEncoder in combination with a NeuralNetworkClassifier, which I think is a basic yet really neat use case for linear pipelines.
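For context, here is a minimal sketch of my setup (assuming MLJ and MLJFlux are installed; the data below is just a placeholder, not my real dataset):

```julia
using MLJ

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

# placeholder data: a Continuous feature plus a Multiclass feature
# that OneHotEncoder will expand into indicator columns
X = (a = rand(20), b = coerce(rand(["x", "y", "z"], 20), Multiclass))
y = coerce(rand(["yes", "no"], 20), Multiclass)

# `|>` builds a linear pipeline; OneHotEncoder ships with MLJ
pipe = OneHotEncoder() |> NeuralNetworkClassifier()

mach = machine(pipe, X, y)
fit!(mach)  # training emits the warning shown below
```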

However, Flux gives this warning:

┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(11 => 16, relu)  # 192 parameters
│   summary(x) = "11×20 Matrix{Float64}"
└ @ Flux ~/.julia/packages/Flux/CUn7U/src/layers/stateless.jl:60

I know MLJ tries to abstract away from concrete data types and use scitypes instead, but in this case it would be great to be able to specify the data type a OneHotEncoder gives as output.

For OneHotEncoder this happens to be really easy to do (it calls `float`); for a ContinuousEncoder it would be a bit trickier (it calls `coerce`).
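Concretely, the kind of option I have in mind might look something like this (purely hypothetical sketch; neither encoder currently accepts such a keyword, and the name is just a suggestion):

```julia
# hypothetical `output_type` keyword; it does NOT exist in MLJModels today
hot = OneHotEncoder(output_type = Float32)
cont = ContinuousEncoder(output_type = Float32)
```

For OneHotEncoder this would amount to broadcasting the requested type where `float` is called now; for ContinuousEncoder the `coerce` step would somehow need to be told which concrete type to target.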

I also don't know if it conflicts too much with the scitype philosophy or if there's an easier way around this, so I thought I'd ask here before making a PR (which I would gladly do).
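For reference, one way around the warning today, without touching the encoders at all, might be to insert a static conversion stage into the pipeline (a sketch; `to_float32` is just a name I made up, and I'm assuming plain functions are accepted as `Pipeline` components and that the classifier is happy with any all-Float32 table):

```julia
using MLJ
NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

# static pipeline stage: round-trip the encoded table through a Float32 matrix
to_float32(X) = MLJ.table(Float32.(MLJ.matrix(X)))

pipe = Pipeline(OneHotEncoder(), to_float32, NeuralNetworkClassifier())
```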
